Unconfigured Ad Widget

Collapse

Announcement

Collapse
No announcement yet.

Here's how to get REAL speed on long batch jobs

Collapse
X
 
  • Filter
  • Time
  • Show
Clear All
new posts

  • Here's how to get REAL speed on long batch jobs

    Chuck: Thanks for you comments on using EVIs. I have not been able to find much "practical" information on using EVI's, however. Not how to create them, but the sort of information you might expect to fully utilize them. The ins & outs, dos & don'ts, tips & techniques, etc. An example of one question I have is do you need to "stack" multiple fields in one index, or is it better to create separate EVI indexes for each field? Do you or your programmer have any information you could share with us? David Daugherty ddaugherty@teamair.com

  • #2
    Here's how to get REAL speed on long batch jobs

    Don't stack multiple fields in different EVI's. The system will combine separate EVI's for you. I believe the SQE will use an EVI when it determines that 20 - 60% of the records will be selected. Use an EVI when the possible values ("selectivity") for a field are fairly limited such as USA states. You wouldn't want to create an EVI over a timestamp field for example because that would be very unique (2004-09-17-10.25.00.123456'). Chris

    Comment


    • #3
      Here's how to get REAL speed on long batch jobs

      David, We have found that you should make a separate EVI for each row. Once you start using EVIs you'll be hooked. For example, we used to run a plain old standard IBM query over a file with 40 million records. The query literally took about 2 hours to finish. After EVIs were created for the 6 fields used in the query, the query finished in a little under 6 seconds. Amazing, truly amazing. chuck Opinions expressed are not necessarily those of my employer. "David Daugherty" wrote in message news:6b16e6e9.0@WebX.WawyahGHajS... > Chuck: > > Thanks for you comments on using EVIs. I have not been able to find much "practical" information on using EVI's, however. Not how to create them, but the sort of information you might expect to fully utilize them. The ins & outs, dos & don'ts, tips & techniques, etc. > > An example of one question I have is do you need to "stack" multiple fields in one index, or is it better to create separate EVI indexes for each field? > > Do you or your programmer have any information you could share with us? > > David Daugherty > ddaugherty@teamair.com

      Comment


      • #4
        Here's how to get REAL speed on long batch jobs

        Also, an EVI may help performance even if it doesn't get used. The optimizer looks at statistics from the different indexes when deciding what method to use. The EVI and binary radix (BR) indexes stores the number of unique values in a particular field. The EVI, however, also stores the number of each individual value. Example: A field with state code. The BR will tell the optimizer there are 50 different key values in the field. The EVI will additionally tell it that there are 50,000 GA, 1,000 FL, and 5 each of the other 48 values. So, if you select GA it may choose a table scan, but if you choose AR it will do it indexed. The BR would have chosen the same method no matter what value was chosen. Looking through the posts, I don't see it mentioned that an EVI will only help in the WHERE clause. It doesn't help with ORDER BY. The multiple index approach like Chris and Chuck say, is the best way. It will create a bit map for each WHERE situation (based on the EVI if possible, or based on the much less efficient BR, if not) and then and/or the bit maps together, giving a single bit map of the exact records meeting the selection criteria. Since it's a bitmap, unless you have a REALLY large file, the entire bit map will fit in memory, cutting down on the disk access needed to process the file. When I say much less efficient BR, I only mean the it's a lot more work to create a bit map from a BR than it is from an EVI. -dan

        Comment


        • #5
          Here's how to get REAL speed on long batch jobs

          I wrote a post on EVI about three years ago. Whwn I looked at the specs it appeared to be based on Foxpro's Rushmore indexing technology. Yes, the speed is there after the data is indexed. My understanding from reading the specs was that the indexes had to be rebuilt every time any data changed. This might have changed in the last three years, I don't know. rd

          Comment


          • #6
            Here's how to get REAL speed on long batch jobs

            Actually about v4 something IBM totally re-wrote the indexing routines. The hit for many indexes is all but gone. And the EVIs are updated on the fly just like any other index. -dan

            Comment


            • #7
              Here's how to get REAL speed on long batch jobs

              Ok, thanks Daniel. rd

              Comment


              • #8
                Here's how to get REAL speed on long batch jobs

                Perhaps you'd find this article useful: TechTip: Encoded Vector Indexes (EVIs) Victoria Mack, Editor

                Comment


                • #9
                  Here's how to get REAL speed on long batch jobs

                  Great discussion! All the info. mentioned is absolutely on target. It's very important when using EVI's that you index just one field at a time. Multi-field EVI's are possible but generally are a bad idea. The best thing about EVI's is that the query optimizer finds them very "attractive". Ad-hoc query performance is a common cause of poor system performance. EVI's can directly address this issue. There has actually been quite a bit of information published on this technology. You can try the following articles if you really want to dig into this topic: "DB2 UDB for iSeries Database Performance and Query Optimization V5R1", which can be found at http://publib.boulder.ibm.com/pubs/h...zajqmst139.htm "Accelerating Your Queries with Encoded Vector Indexes", found at http://www.as400.ibm.com/developer/bi/evi.html "The Foundation of AS/400 BI Query Performance DB2 UDB Indexes", by Tom Moskalik, found at http://www.as400.ibm.com/solution/present/0412LL4.htm "Encoded Vector Indexes - Not Just for Theorists Anymore", by Amy Anderson, March 1999, AS/400 TeraPlex Center.

                  Comment


                  • #10
                    Here's how to get REAL speed on long batch jobs

                    Brian Harder wrote: > "Accelerating Your Queries with Encoded Vector Indexes", found at > <http://www.as400.ibm.com/developer/bi/evi.html> Document no longer available. > "The Foundation of AS/400 BI Query Performance DB2 UDB Indexes", by > Tom Moskalik, found at > <http://www.as400.ibm.com/solution/present/0412LL4.htm> Document no longer available. Bill

                    Comment


                    • #11
                      Here's how to get REAL speed on long batch jobs

                      I find the topic of performance issues interesting. I also find that often the topics revolve around SQL vs. Logical files. However, I never seem to hear the REAL improvement to speed, especially for large files, being discussed here. And that topic is Encoded Vector Index. One of my programmers is a fanatic about tweaking performance and he started diddling with EVI's a few years back. What he noticed is that jobs that used to take a long time were now blazingly fast. Just this morning he came into my office and told me he changed a job that regularly took 2 hours to run using Query without EVI. After the change that 2 hour job was reduced to 10 seconds! This is absolutely no exaggeration! And, what's great is that this is typical, not an abnormal situation. We had nightly summarization jobs that used to run into our 4am backup and once EVIs were introduced there was no more backlog. I recommend anyone that has large files look closely at EVIs and start using them to their benefit. chuck Opinions expressed are not necessarily those of my employer.

                      Comment


                      • #12
                        Here's how to get REAL speed on long batch jobs

                        I KNEW I should have tested those links first! Well, that's where I originally obtained those articles. Here's a couple of links that I know work (at least for now): "Encoded Vector Index and Radix Index Creation", by Michael Cain, IBM TeraPlex Center, 1998. Found at http://www-1.ibm.com/servers/enable/...aplex/evi.html The following is available as a link from the above EVI article too. The TeraPlex Center tends to be a very good source for information on EVI's because they are performance oriented and EVI's can deliver the goods. "Indexing and Statistics Strategies for DB2 UDB for iSeries", Version 3.0, by Michael W. Cain, iSeries Teraplex Integration Center, IBM eServer Solutions, Updated November 2003. Found at http://www-1.ibm.com/servers/enable/...y/strategy.pdf

                        Comment

                        Working...
                        X