Tuesday, August 5, 2014

Decoding eDiscovery Doublespeak

I once had a boss who was fond of calling the particulars of eDiscovery processing and production the "sausage factory," a tip of the hat to the old adage that "Laws are like sausages, it is better not to see them being made."

Any business in a service industry seeks to create the appearance that things are going smoothly. In the same way the waiter at a restaurant will never tell you that your entrée has been delayed because of a small fire in the kitchen, your eDiscovery vendor will seek to shield you from the ugly side of the sausage factory.

Most reputable vendors won’t lie to their clients. They will, however, engage in creative truth-telling, giving just enough information to keep you apprised overall without getting so specific that you see the inner workings of the operation. Sticking close to the truth while leaving just enough ambiguity for some wiggle room is the usual modus operandi. This is generally accomplished through select doublespeak phrases designed to sound just technical enough to persuade you they mean something while conveying absolutely nothing specific.

In our ongoing effort to demystify eDiscovery, here are a few choice examples of vendor doublespeak.

“Contention in the environment.” This ambiguous gem is used to explain delays brought about by causes your vendor would prefer not to disclose. When you hear this phrase, alarm bells should sound, as it means something bad has happened but they’re not ready to tell you what.

Example: “We were not able to make as much progress as expected with this data due to contention in our environment.”

Translation: “We can’t meet the deadline we promised to meet. This is due to either (1) requirements that exceed our capacity or (2) an internal problem like a storage failure or software issue, but we are never ever going to tell you that.”

“Data specific” / “Non-standard data.” When software chokes on your data for a reason that’s beyond the understanding of the developers, they explain that the problem is “data specific” or due to “non-standard data,” which means they probably won’t work too hard to solve the issue, since it’s “your fault.”

Example: “These files, which are non-standard, can’t be processed. The issue is data-specific and does not affect any other files. Our developers are investigating the issue.”

Translation: “We have no idea why these files weren’t processed. We don’t know where else this problem may exist but we are guessing it’s limited to the files you noticed. A ticket has been opened with our development team which will languish in a queue for months before someone closes it because it’s become stale.”

“Analyst / operator error.” Someone messed up and the vendor can’t blame it on anything else. This is typically followed by an assurance that additional Q.C. steps have been added to the process to ensure it never happens again.

Example: “An incorrect custodian value was applied to these documents due to analyst error. We have added a step to our Q.C. process to ensure this error doesn’t occur again.”

Translation: “An analyst made a mistake they really shouldn’t have been able to make in the first place. We have added another item to a checklist that is often ignored because we’re always rushing to keep up with deadlines.”

“Rolling deliveries.” Incremental data deliveries are typically designed to obscure an inability to achieve the throughput needed to deliver your data on time. Breaking the data up at least gives you something to work with so you don’t have downtime during which you wait and silently curse your vendor.

Example: “We have begun processing your 100 GB dataset and will start rolling deliveries on Monday.”

Translation: “We can’t get through your 100 GB in time for review to start, so we’ll begin dropping increments of data into the database for review.”

“Data density.” This meaningless phrase is used to explain any delay in indexing, regardless of the quality, quantity, or type of data involved. Any time indexing slows for any reason, it’s identified as a data problem and the cause is the “density” of the data, a quality which is impossible to measure, gauge, or predict.

Example: “These documents have yet to be indexed due to the density of the data. We will continue to keep you updated on the progress of the indexing process.”

Translation: “Your documents are indexing very slowly. We don’t have an explanation of why the index is hung up, but when this has happened before it clears itself up magically, so we assume these documents are somehow ‘denser’ than others.”

“Final search sweep” / “Data audit.” These catchphrases appear when the vendor has suddenly found data that they should have found earlier. This may be due to an error or a lag in indexing, but that won’t matter to you when 50,000 new documents appear right before your discovery deadline.

Example: “Our standard data audit identified additional documents responsive to your search terms. These 34,000 documents are currently being batched for review.”

Translation: “We missed these documents. Here they are. Good luck. We will expect your request for a credit at the end of the month.”

Keep an eye out for these phrases or others like them. They reliably signify that things aren’t going as well as your vendor would like you to know, and the sooner you ferret out the whole truth, the better your reality will be. At the same time, don’t throw the baby out with the bathwater. eDiscovery is complex and the data can be unwieldy. While nothing should be taken at face value, this is not to say you should take your business elsewhere. Most vendors do have something of value to offer; it’s up to you to make sure that’s what they deliver.
