
Infocentre Access History - 140 million rows!


jstill

Question

Hi all - been a while since I last posted but have a bit of an issue and I wondered if anyone else has come across a solution.  At the moment it's looking like an expensive custom build.

 

In short, the access history is huge. We made the jump from IC to ICP a few months ago so we would no longer need a 12-hour full refresh and could just run the routine 'data refresh' rather than a full bulk transfer. In a nutshell:

 

• Our access history has over 141 million records in it.

• A data refresh (not a full bulk export) in ICP started at 9:15 am does not finish the same day.

• A data refresh started at 03:00 am does complete, but takes 19.5 hours.

• A few days ago I was able to see the BAH file (access history) being processed at 8:50 am - one of the last files processed in the refresh - so it looks like this file alone took 12+ hours to process.

• During the refresh, queries get locked and there are delays for info going into ICP - I have seen delays of over 450 seconds shown on the monitor.

• Obviously the full bulk refresh takes a lot longer still (as it did with the original Infocentre).

• For the meantime, I've changed the data refresh to only run on a Sunday so it doesn't impact the business.

 

The fix would be to remove a lot of the older records from the access history.  As I understand it, the tables in ICP replicate the structure found in OpenGI.  As such, all we really want is a one-off process that deletes records from OpenGI's equivalent of icp_braccesshist.

 

Getting rid of records prior to 01/01/2015 would approximately halve the data we have, but I’d be inclined to get rid of everything prior to 01/01/2017, which would lose over 90 million records.
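To sanity-check those numbers before committing to a cut-off, a couple of counts on the SQL side would confirm the split. This is just a sketch against the icp_braccesshist table named in the thread; it assumes [#date] is stored so that a yyyymmdd string comparison sorts correctly, as the delete statement below implies:

```sql
-- Sanity check (sketch): how many rows each cut-off would remove.
-- Assumes [#date] compares correctly as a yyyymmdd value.
select count(*) as total_rows
from icp_braccesshist;

select count(*) as pre_2015_rows
from icp_braccesshist
where [#date] < '20150101';

select count(*) as pre_2017_rows
from icp_braccesshist
where [#date] < '20170101';
```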

 

In short we just need the OpenGI equivalent of the below.

delete from icp_braccesshist
where [#date] < '20170101'
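Worth noting: if anyone were tempted to run something like that only on the SQL side, it wouldn't stick (a refresh would re-push the rows from OpenGI, which is why the OpenGI-side equivalent is what's needed), and a single 90-million-row delete would also lock the table and balloon the transaction log. A batched variant is the usual T-SQL workaround - sketch only, with the batch size a guess to tune:

```sql
-- Sketch only: batched delete to limit lock escalation and transaction log growth.
-- This clears the SQL copy only; a refresh would re-populate it from OpenGI.
while 1 = 1
begin
    delete top (50000)
    from icp_braccesshist
    where [#date] < '20170101';

    -- stop once no qualifying rows remain
    if @@rowcount = 0 break;
end
```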

 

Has anyone else had this issue and/or found a solution?  I don't want to disable the access hist export as we use the data for a few things.

 

Kind regards
James

 

 

 


18 answers to this question


Hi James - hope you are well!! Great question and I'd be glad to offer some options / advice if I can ...

 

First off - 

 

1 - Why are you doing a data refresh?

2 - What is the "custom solution" (i.e. what is the proposed development that will be completed) and what is the cost?

3 - In any event, if you purged the associated Client Policy files prior to 2015 on the server - presume the log would be pruned too - obvs? 

 

#interestingtopic 

 

 

PS - nothing, Nothing, NOTHING, NOTHING in our world should take 12 hours to process a single file. Your full back-up OF EVERYTHING doesn't take that long - I presume the bottlenecks were investigated before the "custom solution" was proposed?

 

@Tom Davies

@karl

@maskelleto

3 hours ago, Mark Sollis said:

Nothing in our world should take 12 hours to process a single file. Your full back-up OF EVERYTHING doesn't take that long

Even a bulk insert for me only takes around 15 minutes (and I thought our database was reasonably large).


Hi @Mark Sollis @karl,

 

Thanks for the replies, to take the points in order:

 

1.  Occasionally data doesn't go into IC tables.  There's a blip in the connection, or maybe a SQL error (usually from a trigger, which we no longer use) rolls the change back.  A data refresh will push any missing records from OpenGI into SQL.  I have also found a few instances of ghost records - records that are in SQL but not in OpenGI - and the data refresh doesn't fix them.  Presumably the check is only one way: I can't see anything reading each row in SQL and checking OpenGI for it.  So to get rid of those, I think we're going to have to reinstate a regular full bulk refresh.  I don't actually know how long this will take with ICP, but hopefully not the 12 hours it was taking with regular IC towards the end.

 

2.  There isn't one at the moment.  We've been quoted 3 days for investigation / spec and 8-11 days dev cost to build it, which seems excessive given the relative simplicity of what's required.

 

3.  Purging is a big problem for us.  We're in discussions on the GDPR front now, as the functions in OpenGI aren't really any good to us.  In a nutshell: you have to purge transactions / documents for specific policies based on term date, but we don't want to do that.  We want to delete all clients whose last policy expired over 7 years ago.  However, we want policies that relate to still-live clients to remain, as well as any EL policies.  We also want to keep prospects that are being worked on (we use Core for all clients for a bunch of reasons), so we add an extra requirement that the ToB data on the client must be over 3 years old as well.   There's another issue with deleting documents: you can't delete a one-off letter without confirming them all first (which means you can no longer read their content), so you can't delete any policies / clients with a one-off letter on them.  Ideally I'd like to be able to mark clients and policies for deletion, and when we press purge it just deletes all records relating to those references - transactions, documents, access history and so on.  Then we could all set our own flags based on our own criteria, via DB enquiry or via X-Stream.

But for now... purging is out for us.  Fair point you make though.
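The selection logic James describes is simple enough to express as a query. Purely as an illustration of the criteria - every table and column name below is made up, since the real OpenGI/ICP schema will differ - flagging candidate clients might look something like:

```sql
-- Illustration only: clients whose last policy expired over 7 years ago,
-- with no live policies, no EL policies, and ToB data over 3 years old.
-- All identifiers here are hypothetical, not the real ICP schema.
select c.client_ref
from icp_brclient c
where not exists (                        -- no live policies remain
          select 1 from icp_brpolicy p
          where p.client_ref = c.client_ref
            and p.expiry_date >= getdate())
  and not exists (                        -- no EL policies at all
          select 1 from icp_brpolicy p
          where p.client_ref = c.client_ref
            and p.policy_type = 'EL')
  and (select max(p.expiry_date)          -- last policy expired 7+ years ago
       from icp_brpolicy p
       where p.client_ref = c.client_ref) < dateadd(year, -7, getdate())
  and c.tob_date < dateadd(year, -3, getdate());
```

Each firm could keep its own version of a query like this as its "flag for deletion" criteria, which is exactly the packaged-purge idea discussed later in the thread.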

 

With regards to the file taking 12 hours to process, I'm pretty certain it's because it's doing a data integrity check on the 140 million rows as it's being constantly written to by ICP.  The contention on the disks must be crazy.

 

@karl What bulk import process are you referring to?  It can't be the full database refresh in that time, can it?

 

Kind regards

James


Thanks @JamesStill - all makes sense 

 

3 hours ago, jstill said:

1.  Occasionally data doesn't go into IC tables.  There's a blip in the connection or maybe there's a SQL error (usually from a trigger (which we no longer use)) which rolls the change back.  A data refresh will push any missing records from OpenGI into SQL.  I have also found a few instances of ghost records, where records are in SQL but not in OpenGI and the data refresh doesn't fix them

 

Thennnnnnnnnn that's a fault then, isn't it?!  If the current ICP solution provides a real-time transactional reporting database, then it should operate and be performant. The whole campaign around (and upgrade cost of) ICP was so that bulk insert wasn't required and a data refresh was the exception, not the rule. It's what everyone paid for on the upgrade. We need to help OGI get this right and get it fixed to do what it's supposed to do - no help to you right now though ...

 

3 hours ago, jstill said:

We've been quoted 3 days for investigation / spec and 8-11 days dev cost to build it, which seems excessive given the relative simplicity of what's required.

 

Your SQL code example in the OP takes up 5 lines of code. BCPL won't be much different IMHO - but I shouldn't question the detail. But yep - 14 days all in @ £x thousands per day + VAT is a lot of dosh for the equivalent of 5 lines in BCPL. 


If I was asking a builder to quote for an extension, I'd like to see the basis of the quote, so I could understand the complexities. Obviously, to arrive at a figure there must be some finger-in-the-air guesstimate of the work involved. So what is the Schedule of Works upon which their figure is based? No different to a code build - what are the steps and what are the associated estimates for each? It's a fair question if someone is about to spend that sort of money.

 

For a discussion with a developer on other options

  • What if the extract of the BAH was done offline - from the backup file set - and then used to populate / overwrite the SQL BAH table separately?

OR

  • How easy is it to have a new (VSAM) table that is a subset of the BAH and holds data based on a Broker Amendment parameter of n Yrs. I wouldn't be surprised if this was 5 days effort all in ...

 

 

3 hours ago, jstill said:

We're in discussions on the GDPR front now as the functions in OpenGI aren't really any good to us. 

 

The whole purge issue is one that needs looking at and I believe some of this work is being considered - worth pursuing to establish the scope of change in plan / being planned to make sure it is fit for today's integrated environments i.e. IC/ICP wasn't really around when purging was such a hot topic 😐

 

22 hours ago, Mark Sollis said:

I presume the bottlenecks were investigated before the "custom solution" was proposed?

 

Whatever the position, I'd explore this route first - and if OGI haven't done so, or explained why that file is taking so long, then they should. If a dedicated SSD fixes this, then that will be a 3-figure solution, not a 5-figure one (I'm being optimistic, I know, but the principle holds true)

 

 

If all else fails, one option could be to split the extract file processing. The default initialises the DB, but "maybe" it's possible to only initialise a table (in the pre-SQL script) and then extract just the BAH table weekly. I'm sure those options have been considered by OGI as a lower-cost alternative solution ... and maybe they have, and dismissed it for good reason. Another fair question though ...

 

 

Overall - a very tech product / area, and in reality only OGI can provide the right solution, regardless of my ramblings, which are nothing more than conjecture. So there has to be some reliance on, and trust given to, the tech provider. Nothing comes for free and you just have to be happy you are getting a fair deal. If it was my business though, I'd still be asking for all the options, alternatives and a best-guess breakdown of all costs - and my "supplier" would need to justify their numbers

 

 

Hope the above helps and let us know how you get on and what route / solution you end up taking


Hi James

 

I run the command icptrans from the OGI server command prompt. If ever we have an issue such as missing frames (when new ones are created) or scheme headers with spaces after them (when new ones have been added) etc., this is what OGI told me does a bulk insert, which I understand completely refreshes the database. It literally takes around 15 minutes before Info Centre is back up and okay.


I'm going to chip in here and I'll start by saying I've only been "using" ICP for about 6 months and I'm terrible at it.

 

First off, as far as I understand it (and for me that has to be in layman's terms for ICP), the system only updates the changes to the system, thereby using minimal system resources?  It definitely shouldn't need a complete refresh unless you're enquiring on new frames that have just been added?  If you're doing a full refresh, that sounds like a system error, like @Mark Sollis mentioned earlier.  Perhaps something from IC is causing a conflict?  Would suggest that this might be a Support call rather than additional bespoke solutions?

 

Also, I've probably misunderstood what you're trying to do, plus  @Mark Sollis and @karl definitely know more than me, so might well correct me anyway!

 

However, when I had my training, I was told to use "Views" rather than "Tables" when I was setting up my reports.  As such, I end up with the option of Criteria which allows me to restrict the return of data to, say, a specific date range.  So, if I run a report, I include the table "ic_braccesshist".  I can link that to a brpolicy to see the whole Access History but then if I create Criteria for the date range I need, it only returns the relevant info?
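In raw SQL terms, that Views-plus-Criteria approach amounts to pushing a date predicate into the query, so the server only returns the relevant slice instead of all 140 million rows. A sketch of the equivalent query - the join key and extra columns are assumed for illustration, only ic_braccesshist and [#date] come from the thread:

```sql
-- Sketch: restrict access history to a date range rather than returning everything.
-- The join column (policy_ref) is an assumption - check the actual keys.
select ah.*, p.policy_ref
from ic_braccesshist ah
join ic_brpolicy p
  on p.policy_ref = ah.policy_ref        -- assumed join key
where ah.[#date] >= '20190101'           -- criteria: only recent history
  and ah.[#date] <  '20200101';
```

As Mark notes below, this speeds up reporting on the SQL side but doesn't shrink the table or the transfer times.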

 

Like I said, probably way off the mark, so apologies if I am.


In respect of the Purging, this is an Open GI User issue for everyone and so will be raised for discussion at the next Committee meeting.  Would you agree @Mark Sollis?

 

Purging was designed in the day when people wanted it as an option but rarely used it.  So it allowed you to pick and choose how much you cleared, which might still need to be on offer.

 

However, personally, I would like to see a packaged "Purge" where, based on a set of criteria (including the option to exclude certain policy types, frames and dates), out-of-date policies are removed: if a live policy remains on the client file, the client file stays in place, but if not, the whole client file is purged.

 

Needs discussion and some open debate within the Membership but it's certainly worth investigation!


 

6 minutes ago, Marc Pons-Burt said:

First off, as far as I understand it and for me that has to be layman's terms for ICP, the system only updates the changes to the system, thereby using minimal system resources?  It definitely shouldn't need a complete refresh unless you're enquiring on new frames that have just been added? 

 

Agree - It should be noted however, that the ICP upgrade was promoted (sold?) on the basis the new features meant it wouldn't need a refresh when new tables / frames / insurers / execs were added - so would question any need in those circumstances tbf. 

 

Yes - once in a blue moon to make sure all is in synch - but low-level hiccups should be managed by the app and its processes to "catch up" any missing changes

 

6 minutes ago, Marc Pons-Burt said:

However, when I had my training, I was told to use "Views" rather than "Tables" when I was setting up my reports.

 

As far as the views go, you are correct - that maximises performance on SQL tables of this size - but it won't solve the transfer times described by @JamesStill

 

It just needs a bit of lateral thinking to get the right solution at the right cost. It seems a knee-jerk "Ch***zillion" estimate for a solution that could probably be done better, and at lower cost, if the issue was properly investigated and scoped :huh:

1 minute ago, Mark Sollis said:

As far as the views go, you are correct to maximise performance on SQL tables of a size - but this won't solve the transfer times described by @JamesStill

 

@Mark Sollis I'm assuming these lead times should resolve once ICP is running as intended and without the repeated full updates?

 

If the system is "up to date", which it should be, in real time, then the return of info should be relatively quick, especially if restricted to a shorter date span?

 

Or have I got my wires crossed again?

 

I guess my take-away from this is:

 

My solution would be to go back to OGI - either Support or, personally, I would go to my Account Manager - and ask them to look into possible problems with the ICP data feed.  Maybe even ask for an Engineer to take a look in case there's something wrong with the setup causing the data return issue.

 

If they confirm it's not ICP, then perhaps look into other solutions, but for me, the bespoke solution is normally very expensive and won't necessarily leave you happy!

20 minutes ago, Marc Pons-Burt said:

Mark Sollis I'm assuming these lead times should resolve once ICP is running as intended and without the repeated full updates?

 

Correct - but as @JamesStill says, there are issues if the refresh/insert is not run at least weekly, which is what he is having to do. The whole point of ICP was that this is not a requirement

 

So - I'm with you - that this is a "fault" if problems occur with replication of data - and should be reported to get fixed. Otherwise, it's not doing what it's supposed to be doing :ph34r:. Meanwhile, James still needs a solution.

 

Speak to the Account Manager - on both counts and shelve the £xx,000 estimates


Just wanted to say thanks to all for the feedback.  One thing I have found with ICP - do not change a keyword name unless you're planning to do a complete refresh afterwards - every time you edit one of the records you'll get ICP errors in the logs and the ic_view will not update (the ICP table is OK).

 

We've also found that sometimes new fields being added do not get pushed to ICP at all (we're raising a call about that now).

 

Kind regards
James

2 hours ago, jstill said:

- do not change a keyword name

 

Wellllllllllllll I'm not sure #OpenGI "support" keyword changes - I suspect it's an oversight :blink: but can be useful for sure. Strange that the ICP table is OK but the view is failing - why does 'SELECT * FROM' need to know the column names :wacko:

 

2 hours ago, jstill said:

new fields being added do not get pushed to ICP

 

Yep - sounds like a fault feature that needs to be resolved 🙂

 

20 hours ago, jstill said:

We've also found that sometimes new fields being added do not get pushed to ICP at all

Yep, first thing I have to do after adding new fields is run a bulk insert 

On 28/02/2020 at 11:55, Mark Sollis said:

The whole campaign around (and upgrade cost of) ICP was so that bulk insert wasn't required and a data refresh was the exception not the rule. It's what everyone paid for on the upgrade. We need to help OGI get this right and get it fixed to do what it's supposed to do

 

On 28/02/2020 at 14:26, Mark Sollis said:

Agree - It should be noted however, that the ICP upgrade was promoted (sold?) on the basis the new features meant it wouldn't need a refresh when new tables / frames / insurers / execs were added - so would question any need in those circumstances tbf. 

 

Adding new fields does not (should not!) require a bulk insert. If it's the case that anyone needs to, then a fix is required from Open GI imho.

 

The whole premise for th£ upgrad£ to IC "PLUS" was to solv£ the n££d for bulk ins£rts when - tables (frames) were added, fields were changed or, new VT entries made (insurers etc)

 

We all have to demand the best solution and get OGI / help OGI to fix what is broken - otherwise it will never change and we will all be running bulk inserts every day for the rest of our lives - which is what we did before ICPlus was launched. #petpeeves

 


On 27/02/2020 at 14:52, jstill said:

Has anyone else had this issue and/or found a solution?  I don't want to disabled the access hist export as we use the data for a few things.

 

Hi James - any updates on this - really interested to see what the outcome was

 

And FYI for anyone in the same position - I've now defined a solution that actually works 🙂 - happy days!

 

😷


Hi @Mark Sollis,

 

We're going with getting a custom dev done for the deletion of access history records.  Our account manager Phil Brown has done a good job getting the cost down.  Just waiting on sign-off internally right now.  Fingers crossed that'll solve the problem for us.  At least then we can do full refreshes without them taking forever :)

 

Hope you're all staying safe these days

On 27/02/2020 at 19:09, karl said:

Even a bulk insert for me only takes around 15 minutes (and I thought our database was reasonably large).

Hour and a half for us. Microsoft's Hyper-V doesn't leverage host CPUs very well at all.
