April 2024 update

Another long time since our last update! Rest assured, we have been very busy. Many of the changes since our last update are “behind the scenes”, but we have continued to add some front-facing features and improvements.

Thank you as always to our wonderful customers for your feedback and ideas.

Discovery tool

New features

  • Search aliases
  • A list of the page numbers with one or more redactions is shown when viewing a redacted document.
  • “Remember me on this device” option on two-factor authentication, to reduce the need to re-authenticate after each login on the same device.
  • Ability to browse documents with custom fields.

Improvements

  • Improved performance of browsing hidden documents in large projects.
  • Improved performance of large custom lists.
  • Improved performance of allocating large numbers of documents to users.
  • Added direct download option for virtual folders.
  • MD5 hash column added to discovery list “extra columns” option.
  • Combining redacted documents now allows editing of the combined redacted versions.
  • Combining redacted documents now combines the “look-through” redacted versions.
  • More helpful messages for a user attempting to access a restricted document.
  • Improved advanced email deduplication.
  • Improved detection of inline/background images in native emails (reduces the occasions where an inline/background image is extracted as an attachment).
  • Improved performance for merging parties on very large projects.
  • Ability to accommodate some mis-formatting/inconsistencies in production number stamps added to PDFs by other parties/systems when detecting production numbers.
  • Improved handling & normalisation of irregular characters in imported PDFs.
  • Improved ability to rotate and scale PDFs with irregular properties.

Casebook tool

  • Ability to add hyperlinked annotations to an existing PDF.
  • Smarter pagination of multi-page indexes (avoid table breaks too near the top or bottom of a page).
  • Identify the parts of automatically split documents in an index.
  • Option to restart tab numbers on each series volume.
  • Option to restart tab numbers in a series block.
  • Modified the default index template to be more consistent with the examples in the Practice Note.
  • Improved appearance of indexes with no documents.
  • Greatly improved performance when generating very large indexes (100+ pages).
  • Easier way to select references when manually hyperlinking a PDF.
  • Improved error messages for invalid references when manually hyperlinking a PDF.
  • Hyperlink errors in footnotes now state which footnote.
  • Colour selection for stamps.
  • Option to use “natural” ordering when ordering by name (e.g. 1, 2, 3 … 10, 11, 12 instead of 1, 10, 11, 12 … 2, 20, 21).
  • Badge for documents with noted objections.
  • Improved handling of ambiguous references.
  • Ability to use imported page numbers as hyperlink references.
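
The “natural” ordering option in the list above can be illustrated with a short Python sketch. This is not LawFlow's implementation: the `natural_key` helper is a hypothetical name, and the sketch assumes a name is split into runs of digits and runs of other characters so that the digit runs compare as numbers.

```python
import re

def natural_key(name: str):
    """Build a sort key where runs of digits compare numerically (hypothetical helper)."""
    chunks = [c for c in re.split(r"(\d+)", name) if c]
    # Tag each chunk so numbers and text are never compared with each other directly.
    return [(0, int(c), "") if c.isdecimal() else (1, 0, c.lower()) for c in chunks]

names = ["Tab 10", "Tab 2", "Tab 1", "Tab 21", "Tab 11"]

plain = sorted(names)                     # character-by-character ordering
natural = sorted(names, key=natural_key)  # "natural" ordering

print(plain)
print(natural)
```

With plain ordering “Tab 10” sorts ahead of “Tab 2” because the comparison is made character by character; the natural key compares 2 and 10 as numbers instead.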

June 2023 update

It’s been a long time since our last update! In the past year we have been focusing our efforts on two main areas:

  1. Electronic Casebook module. This tool is specifically designed to create hyperlinked, paginated electronic casebooks (and print versions) in accordance with the Senior Courts Civil Electronic Document Protocol. It has been in production testing with various firms since early 2022, and has successfully produced dozens of casebooks. We are now rolling out this feature to all customers – another announcement will follow soon!
  2. Performance. As projects continue to grow in size, and new features and options for managing them are added, we have worked hard on optimising performance. In particular, for very large projects (with 1 million+ documents) we have made significant improvements in both user experience and back-end processing. We are also moving to bigger and faster servers to handle the ongoing increases in user numbers and project storage.

Here’s a quick summary of some of the changes since our last update.

New features

  • Integration of the Electronic Casebook module. This lets you directly copy documents from your discovery project into your casebook project, and keep track of which discovery documents have been nominated.
  • Restricted user access: the ability to restrict the documents that specific users can see – by repository, folder or virtual folder.
  • Administrator-only repositories: to restrict access to certain repositories to administrators only.
  • Ability to download a marked-up “diff” when comparing documents in the “compare documents” tool.
  • Support for WEBM video files.
  • Search criteria for confidential categories.

Improvements

  • Numerous performance improvements, including major improvements for very large projects (1m+ documents).
  • Smarter detection of inline images in emails, resulting in fewer “junk” image attachments being extracted.
  • Easier selection of parties in search tool.
  • Better handling of huge text/CSV files.
  • Improved display of pre-formatted plain text files.
  • PDF documents with closed popup annotations are now flagged as having hidden text.
  • A previously-unsupported PDF annotation type is now included in text extraction.
  • Improved search-hit highlighting of non-standard characters.
  • Improved search-hit highlighting for wordlist results.
  • Improved handling of invalid EXIF data in image files.
  • Improved detection of different kinds of Outlook meeting notifications.
  • Fixed rare pane resizing bug.

In response to increased demand, we have also expanded our training capacity. We are frequently running training sessions for firms around the country. We provide free training for our customers, so if you would like to arrange a session for your firm please contact us.

Thank you as always to our valued customers for your support and feedback.

February 2022 update

Due to an extremely busy second half of last year, it’s been a while since we posted an update. We have continued to roll out new features and improvements; here are some highlights from the past 6 months. Thank you as always to our valued customers for your support and feedback.

New features

  • When searching for words or phrases in a PDF document, the plain text of the PDF is shown with search terms highlighted. This makes it easier to find search terms within PDFs, by avoiding the limitations of highlighting search terms directly in a PDF viewer.
  • Added “next hit” and “previous hit” navigation buttons to text search results, to allow easy navigation of search hits within a document.
  • Automatically scroll to the location of the first search hit in a text search result.
  • Added the ability to download a plain text version of supported documents (which covers most text-based documents).
  • Added a way to add a document to a virtual folder directly from the document details page.
  • Translations: option to include translated versions of documents in bundles.
  • Ability to export metadata (document properties) reports for any set of documents.
  • Option to export metadata in columnar format.
  • Ability to set custom colours for tag & virtual folder badges.

Improvements

  • Further improvements to highlighting of search results (better handling of wildcards, groupings & proximity searches).
  • Automatic duplicate removal: when uploading documents, if the destination folder already contains a visible, top-level exact duplicate of the uploaded document with the same filename, the upload is skipped. The system has always automatically hidden exact top-level duplicates in client repositories (regardless of filenames). This new behaviour applies to all repositories, and is intended to prevent the not uncommon situation of the same documents being inadvertently uploaded multiple times to the same folder.
  • More options for restricting user access to specific folders or repositories.
  • Improved metadata (document properties) reports (exporting document metadata to Excel).
  • Improved positioning of look-through redaction boxes of rotated documents.
  • Exhibit lists can now use standard list stamping, in addition to affidavit exhibit note stamps.
  • Improved prioritisation of document conversion.
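
The automatic duplicate removal rule in the list above can be sketched as follows. This is an illustration only, not LawFlow's code: the `StoredDoc` record, the `should_skip_upload` function and the use of a SHA-256 content hash to decide what counts as an “exact duplicate” are all assumptions made for the example.

```python
import hashlib
from dataclasses import dataclass

@dataclass
class StoredDoc:
    """Hypothetical record of a document already in the destination folder."""
    filename: str
    content_hash: str
    visible: bool = True
    top_level: bool = True

def content_hash(data: bytes) -> str:
    # Assumption: two files are "exact duplicates" when their content hashes match.
    return hashlib.sha256(data).hexdigest()

def should_skip_upload(filename: str, data: bytes, folder: list) -> bool:
    """Skip only if a visible, top-level document has the same content AND the same filename."""
    digest = content_hash(data)
    return any(
        doc.visible
        and doc.top_level
        and doc.filename == filename
        and doc.content_hash == digest
        for doc in folder
    )

folder = [StoredDoc("letter.pdf", content_hash(b"letter body"))]

same_again = should_skip_upload("letter.pdf", b"letter body", folder)  # same name, same content
renamed = should_skip_upload("copy.pdf", b"letter body", folder)       # same content, new name
changed = should_skip_upload("letter.pdf", b"edited body", folder)     # same name, new content
print(same_again, renamed, changed)
```

Only the first upload is skipped. A renamed copy or an edited file with the same name still goes through, which matches the rule's requirement that both the content and the filename match.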

July 2021 update

After a series of rolling updates, here is a selection of new features and improvements:

New features

  • Look-through redactions: when you create a redacted version of a document in LawFlow, the system now additionally creates a separate look-through version with translucent redaction boxes, so you can see behind the redaction. This makes it easier to review redactions, and provides a convenient way to show permitted parties what has been redacted.
  • Redaction page tracking: when you create a redacted version of a document in LawFlow, the system now records which pages have had redactions made on them. This can help with reviewing redactions, and with advising other parties which pages in a document have been redacted.
  • Faster redactions: the redaction process now only processes pages that have one or more redactions. This provides a major performance improvement for longer documents where only a few pages are being redacted.
  • Smaller redactions: due to an improved process, redacted documents are now typically significantly smaller in filesize than previously.
  • Translation support: this release adds the ability to attach a translation (either in plain-text, Word or PDF format) to a document. For documents with attached translations, you are able to view the translation together with the original document, and search on the translation text. Further planned enhancements will allow the translated versions to be included in bundles.
  • When files are extracted from a zip or 7-zip archive, the File Created and Last Modified dates are recorded against the extracted documents.

Improvements

  • Significant overall performance improvements for large projects.
  • Searching: the ability to exclude documents matching one or more word lists (previously, searches could only include documents matching one or more word lists).
  • Searching: improved highlighting of search results.
  • Improved performance of the Discovery tab’s “Document type” box with a very large number of document types.
  • Improved performance for bulk linking and unlinking of documents to email addresses.
  • Improved performance and integrity checking of zip file extraction.
  • Significantly improved performance of 7-zip file extraction (especially for very large, multi-GB 7z files).
  • Automatic removal of Mac OS “resource fork” junk files from uploaded zip & 7-zip archives.
  • Deleting pages from a PDF now removes all bookmarks (outline entries) from the PDF.
  • Processing of barcode-separated batch scans (automatic splitting of scanned PDFs with separator sheets) now supports image-compressed PDFs.
  • Better handling of zero-byte files (zero-byte files are usually the result of incorrectly or incompletely processed files).
  • Parent production number column added to Excel discovery list with “extra columns” enabled.
  • Improved validation of date values when importing discovery information.
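
To show what removing Mac OS “resource fork” junk files (mentioned in the list above) can look like, here is a minimal Python sketch for zip archives. It assumes the usual macOS conventions of a `__MACOSX/` directory and `._`-prefixed AppleDouble files; the function names are hypothetical and this is not LawFlow's implementation. 7-zip archives are left out because the Python standard library has no reader for them.

```python
import io
import zipfile
from pathlib import PurePosixPath

def is_resource_fork_junk(member_name: str) -> bool:
    """True for entries under __MACOSX/ or AppleDouble files named '._something'."""
    path = PurePosixPath(member_name)
    return "__MACOSX" in path.parts or path.name.startswith("._")

def real_members(archive: zipfile.ZipFile) -> list:
    """List file entries worth extracting, ignoring directories and junk entries."""
    return [
        name
        for name in archive.namelist()
        if not name.endswith("/") and not is_resource_fork_junk(name)
    ]

# Build a small in-memory archive that mimics a zip created on a Mac.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as zf:
    zf.writestr("docs/letter.pdf", b"%PDF-")
    zf.writestr("__MACOSX/docs/._letter.pdf", b"resource fork data")
    zf.writestr("docs/._notes.txt", b"resource fork data")

with zipfile.ZipFile(buffer) as zf:
    kept = real_members(zf)

print(kept)
```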

March 2021 update

New features

  • Browsing documents by email address now allows further browsing by role (From, To, Cc or Bcc).
  • Linking email addresses to parties (authors & recipients): added an option for controlling whether to exclude CC and BCC recipients.
  • New discovery review checklist checks:
    • Redundant emails;
    • Incomplete review tasks;
    • Non-privileged, non-redacted emails sent from legal domains.

Improvements

  • A discovery review checklist can now be generated for a specific repository, folder or other set of documents, instead of for all client documents in the project.
  • Exporting document information includes review task notes.
  • Adding and removing of review tasks for documents is now recorded in affected documents’ history.
  • Improved apparent duplicate email detection.
  • Improvements to email disclaimer detection.
  • “Hide junk image attachments” tool now includes PDFs (previously this feature only applied to native emails).
  • Improved performance when processing PDF files with a large number of fields.
  • Processing of barcode-separated batch scans removes superfluous separator pages at the end of a batch scan.

January 2021 update

Happy new year! We have a big to-do list of new features and improvements that we are already busy working on for 2021. In the meantime, here is a summary of recent new features and improvements.

Thank you as always to our great customers!

New features

  • Virtual folders that a document is in are now shown on the document’s Info tab.
  • Hidden duplicates of a particular document can be viewed via the Related tab.
  • The search results page shows the number of duplicates excluded by the “exclude duplicates” option.
  • Inclusion of custom fields on “export to Excel” and discovery list reports.

Improvements

  • The “Discoverable” workflow category is now separated into “Discoverable – partially reviewed” and “Discoverable – fully reviewed”.
  • Improved ability to redact PDFs containing some internal errors.
  • Improved ability to OCR PDFs containing some internal errors.
  • Improved ability to detect OCR requirement for PDFs containing numerous small images.
  • Improved handling of trivial OCR results.
  • Improved handling of blank pages and non-text documents in compare documents tool.
  • Added part-privileged redaction check to Discovery Review Checklist.
  • Importing notes excludes duplicate notes.
  • Improved handling of Zip files that otherwise cannot be opened due to over-length internal file paths.
  • Better layout of discovery review tasks and notes.
  • Incomplete tasks are displayed before completed tasks.
  • Separation of incomplete and complete tasks on Discovery page.
  • Various performance improvements.

Major system upgrade complete

We’ve been busy labouring this Labour Day weekend, to carry out a major upgrade of the LawFlow servers. This will allow us to keep up with the significant growth in our user base, and the ever-increasing average size of discovery projects. It will also enable us to roll out numerous new features that we have planned for the coming months and into 2021.

Thank you as always to our awesome customers! Keep an eye on this blog for news about further updates.

LawFlow – New Zealand’s leading e-discovery solution

New OCR system

A feature of our September update of LawFlow that we are particularly excited about is our new OCR system. While LawFlow has always provided OCR capability, the September update implements a new custom system that we have been developing for some time, incorporating leading OCR technology, and tailored specifically for e-discovery.

Key improvements of the new OCR system include:

  • Significantly reduced lead time for uploaded documents to be OCRd.
  • More robust processing due to improved identification and handling of corrupt or malformed PDFs.
  • The ability to OCR more DRM-protected PDF files (some DRM restrictions may still prevent specific PDFs being processed).
  • The ability to perform OCR on detected image-based pages within an otherwise text-based PDF. This can occur where text images are inserted into a natively-generated PDF, or where text-based and image-based PDFs are merged into one.
  • Improved detection of document number stamps (frequently applied in e-discovery) that otherwise prevent a PDF (or certain pages of it) from being a candidate for OCR.
  • Confidence scoring of OCR-processed documents.
  • Detection of specific pages with low confidence scores.
  • Separate processing of longer documents in order to reduce delays in processing smaller, faster-processable documents.

As with our previous OCR system, the new system is not cloud-based but is fully hosted on our hardware right here in New Zealand. This means we do not send project data to a third-party or overseas for OCR processing.

OCR accuracy

As always with OCR, accuracy depends heavily on the quality and characteristics of the input. In general terms, well-scanned clean black-and-white block text with standard fonts & font sizing is likely to produce a relatively accurate OCR result. Conversely, lower-quality scans, non-standard fonts, stylised/coloured layout, marks on the image, etc will likely result in lower accuracy.

However, even with high-quality input, there can still be inaccuracies – a “good” OCR accuracy rate is considered to be around 95-99%. There can also be complications and inaccuracies in reconstructing the OCR text into sentences or paragraphs. This should be taken into consideration when searching or otherwise using OCR-generated text.
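
To get a feel for what these accuracy rates mean for searching, here is a rough back-of-the-envelope calculation in Python. It rests on two simplifying assumptions that are not part of the description above: that the accuracy rate applies per character, and that errors on different characters are independent. Under those assumptions, the chance that a search term of a given length is recognised with no errors is the accuracy raised to the power of the term's length.

```python
def chance_term_intact(char_accuracy: float, term_length: int) -> float:
    """Probability that every character of a term is recognised correctly,
    assuming independent per-character errors (a simplification)."""
    return char_accuracy ** term_length

for accuracy in (0.95, 0.99):
    for length in (5, 10, 20):
        chance = chance_term_intact(accuracy, length)
        print(f"accuracy {accuracy:.0%}, {length}-character term: {chance:.1%} chance of no errors")
```

On these assumptions a 10-character term comes through with no errors roughly 60% of the time at 95% accuracy, and roughly 90% of the time at 99% accuracy, which is why exact-match searches over OCR text can miss documents that do contain the term.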

OCR processing

The outline of the new OCR system’s basic processing stages for each document in a project (which remains similar to the previous system) is as follows:

  1. Determine whether the document is of a type suitable for OCR (PDF or supported image files). If not, do not attempt OCR.
  2. For PDF documents, if every page of this PDF file already contains detectable text above a de minimis level (after attempting to exclude any detected document number stamps) then do not attempt OCR.
  3. Run the OCR process on the document (for PDFs, only on pages that do not already have detectable text above the de minimis level).
  4. If the OCR process detected any text, convert the document to a searchable PDF with the OCR text applied.
  5. Index the OCR detected text (for use in searching).
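
The stages above can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not the actual LawFlow OCR code: the de minimis threshold, the stamp pattern, the list of supported file types and all function and class names are hypothetical, and the OCR engine itself is passed in as a plain callable.

```python
import re
from dataclasses import dataclass
from typing import Callable

DE_MINIMIS_CHARS = 20                           # hypothetical threshold
OCR_FILE_TYPES = {"pdf", "png", "jpg", "tiff"}  # assumed set of supported types
# Hypothetical document number stamp format, e.g. "ABC.0001".
STAMP_PATTERN = re.compile(r"\b[A-Z]{2,6}\.\d{3,}(?:\.\d+)*\b")

@dataclass
class Page:
    existing_text: str = ""  # text already detectable on the page before OCR

@dataclass
class Document:
    file_type: str
    pages: list

def has_real_text(page: Page) -> bool:
    """Stage 2 check: ignore detected stamps, then compare with the threshold."""
    remaining = STAMP_PATTERN.sub("", page.existing_text).strip()
    return len(remaining) > DE_MINIMIS_CHARS

def pages_to_ocr(doc: Document) -> list:
    if doc.file_type not in OCR_FILE_TYPES:
        return []                           # stage 1: unsuitable type, no OCR
    if doc.file_type != "pdf":
        return list(range(len(doc.pages)))  # image files: OCR every page
    # Stages 2 and 3: only PDF pages without real text are candidates.
    return [i for i, page in enumerate(doc.pages) if not has_real_text(page)]

def run_ocr(doc: Document, ocr_page: Callable[[int], str]) -> dict:
    """Return OCR text by page index. An empty result means no text was detected,
    so the document would not be converted to a searchable PDF or indexed (stages 4 and 5)."""
    results = {i: ocr_page(i) for i in pages_to_ocr(doc)}
    return {i: text for i, text in results.items() if text.strip()}

doc = Document("pdf", [
    Page("This page already has plenty of detectable text on it."),
    Page("ABC.0001"),  # only a stamp, so still a candidate for OCR
    Page(""),          # image-only page
])
candidates = pages_to_ocr(doc)
ocr_text = run_ocr(doc, lambda i: "scanned text" if i == 1 else "  ")
print(candidates, ocr_text)
```

Passing the OCR engine in as a callable keeps the sketch runnable without any OCR library; in a real system that callable would wrap the OCR engine, and the returned text would feed the searchable PDF conversion and the search index.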

If you have any questions about our new OCR system or how to handle OCR text in your discovery project, get in touch with us and we’ll be happy to help!

September 2020 update

We are continuing to work on a lot of new features & improvements to our LawFlow e-discovery system, and will continue rolling them out progressively. Here are highlights of the latest update.

New features

  • New OCR system to improve the performance, robustness and usefulness of the OCR process.
  • Hide emails by address tool (similar to “hide emails by domain sender tool” in the previous update). This allows hiding (or deleting) of all emails sent from selected email addresses.
  • Ability to exclude a saved search from within another search. This makes it easier to create searches that exclude documents meeting specific criteria. So if you want to do a search such as “All documents in criteria A, but excluding any in criteria B”, you can create a search for criteria B, save it as, say, “Criteria B”, and then create another search with criteria A that also excludes the “Criteria B” saved search.
  • Option to toggle additional columns (author, recipient, etc) in the left-hand slide-out pane in Details view.
  • Ability to link multiple documents to chronology events at the same time via the tray.
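
Excluding a saved search from another search, as described in the list above, amounts to a set difference over document IDs. The Python sketch below illustrates the idea only: the `search` function, the sample documents and the way saved searches are stored are all hypothetical, not the LawFlow search engine.

```python
def search(documents: dict, predicate, saved_searches: dict, exclude_saved=()) -> set:
    """Return IDs of documents matching the predicate, minus the hits of any excluded saved search."""
    hits = {doc_id for doc_id, doc in documents.items() if predicate(doc)}
    for name in exclude_saved:
        hits -= saved_searches[name]  # set difference: drop everything in the saved search
    return hits

# Hypothetical sample data.
documents = {
    1: {"type": "email", "privileged": False},
    2: {"type": "email", "privileged": True},
    3: {"type": "pdf", "privileged": False},
}

saved_searches = {}

# Step 1: run the search for criteria B and save it as "Criteria B".
saved_searches["Criteria B"] = search(documents, lambda d: d["privileged"], saved_searches)

# Step 2: search for criteria A, excluding the "Criteria B" saved search.
result = search(
    documents,
    lambda d: d["type"] == "email",
    saved_searches,
    exclude_saved=["Criteria B"],
)
print(result)
```

Here criteria A is “all emails” and criteria B is “all privileged documents”, so the result contains only the email that is not privileged.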

Improvements

  • Usability improvements to “link email addresses to parties” tool.
  • Quick link for adding users added to home page.
  • Improved detection of specific watermark text on PDFs.
  • Improved no-content detection for vector-based PDFs.

Thanks as always to our great customers for your support and feedback.

August 2020 update

New features

  • Option to hide (or delete) emails from the same sender when hiding a document. This is useful for quickly removing ‘junk’ emails, e.g. spam, newsletters, etc from a particular sender when hiding one document from that sender.
  • Hide emails by domain sender tool. This allows hiding (or deleting) of all emails sent from selected domains. This is useful for quickly removing ‘junk’ emails, e.g. spam, newsletters, etc from multiple particular senders at once.
  • Search function added to redaction tool – this allows basic searching of text in the document being redacted.
  • Ability to view the source duplicate of a list document when using duplicate placeholders. This makes it easy to see the source duplicate without having to navigate away.

Improvements

  • Improved performance of the redaction tool for very long documents.
  • Improved performance for redaction processing (i.e. generation of the safely redacted version of a document after confirming the redactions to make). Redacted versions of documents are now created over twice as fast as previously – typically now less than 1 second per page.
  • Improved apparent duplicate email detection, including smarter algorithm for handling emails-as-attachments.
  • Improved responsiveness of resizable panes.
  • “Merge parties” tool now includes parties automatically created for mapping files (i.e. authors/recipients automatically created when importing another party’s document list information).
  • Improved extraction of non-standard EXIF metadata properties from images.
  • When viewing a list, the “dupe” badge indicating duplicates now only shows for duplicates on the same list.
  • Various performance improvements.