Re: Inserting Large Amounts of Data
- Subject: Re: Inserting Large Amounts of Data
- From: "Wanko, Christopher G, CFCTRCFFIN" <apollo@ATT.COM>
- Date: Thu, 8 Oct 1998 09:42:09 -0400
> I utilize several LOAD-LOOKUPS for basic data retrieval.
> Read the input file (txt) and massage basic variables. I then need to
> determine if row already exists in target table (PS_JOB). Input file has
> employee's SSN and not their EMPLID so I need to join PS_PERSONAL_DATA
> with PS_JOB to determine if row already exists.
Stop doing everything in the database. Large data center ops take this
approach:
  1. dump both tables to flat files;
  2. merge the flat files;
  3. sort unique on the key fields you care about;
  4. strip out the excess records from PS_PERSONAL_DATA (or grep out the
     PS_JOB rows), which leaves a file containing only the rows that
     have not been inserted yet;
  5. load the data using that scrubbed file.
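
To make the scrub step concrete, here is a rough sketch in Python rather than
a sort utility. It is only an illustration under stated assumptions, not what
SyncSort does internally: the function name new_rows is made up, both inputs
are assumed to be already sorted on the same string key (SSN here), and the
dump of existing rows is assumed to have been reduced to one key per record.

```python
def new_rows(input_rows, existing_keys):
    """Yield input rows whose key does not appear in existing_keys.

    Assumptions (hypothetical layout, not from the original post):
      - input_rows yields (key, row) pairs sorted ascending by key
      - existing_keys yields bare keys sorted ascending
      - keys are compared as strings, so both files must use the
        same string sort order
    """
    existing = iter(existing_keys)
    current = next(existing, None)
    last_emitted = None
    for key, row in input_rows:
        # Advance the existing-key stream until it catches up with key.
        while current is not None and current < key:
            current = next(existing, None)
        if current == key:
            continue  # row already exists in the target table
        if key == last_emitted:
            continue  # duplicate key within the input file ("sort unique")
        last_emitted = key
        yield row


if __name__ == "__main__":
    incoming = [("111", "111,a"), ("222", "222,b"), ("333", "333,c")]
    already_loaded = ["222", "400"]
    for row in new_rows(incoming, already_loaded):
        print(row)
```

Because both streams are read once in key order, memory use stays flat no
matter how large the files are, which is the same property that makes the
sort-and-merge approach attractive for tens of millions of rows.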
In one of my environments, a database-only batch job might take about
2 hours to process 40 million rows. After using SyncSort to scrub out the
redundant rows first, the same process ran in 20 minutes. You can do more
of the processing outside your SQR/database logic than you might think.
For this kind of bulk work, flat-file processing with a heavyweight sort
utility is almost always faster.
-Chris