How PublishOne Sets The Benchmark For Content Handling

EY (Ernst & Young) is a leading provider of insights and quality services to capital markets and economies all over the world. It delivers multi-sector expertise across four integrated service lines — Assurance, Consulting, Strategy and Transactions, and Tax. EY helps clients to capitalize on new opportunities, and assess and manage risk to deliver responsible growth.

In the Netherlands, EY’s Tax and Legal Knowledge division uses an internal database to integrate content from external providers and internal content creators. A huge proportion of that content comes from Wolters Kluwer. In fact, EY consumes around 100,000 republished documents every day.

Receiving a content stream of that size and scale requires a robust and reliable infrastructure. That’s where data specialists PublishOne come in.

Document and data consumption on a huge scale

In fields like tax and legal knowledge, published content must be accurate, authoritative, and error-free. That’s why EY requires a robust and reliable mechanism with the built-in ability to perform validation checks. But importing huge volumes of complex and regularly updated content – like legal commentaries and case law – is not an easy process to streamline.

Take this example. Let’s say you have 100,000 documents in your database, and 9,000 contain information that requires a technical change. First, in the interest of speed, you’ll want to refresh only the 9,000 affected documents rather than every document in your repository, which would put unnecessary load on your system. Second, you’ll want a trustworthy way to roll out the updated information everywhere it appears.
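The idea of refreshing only the affected documents can be illustrated with a small sketch. It is not PublishOne's implementation, which the article does not describe; it simply assumes that each stored document has a content fingerprint that can be compared against a new delivery. The names `fingerprint`, `changed_documents`, and the sample document IDs are hypothetical.

```python
import hashlib


def fingerprint(text: str) -> str:
    """Return a stable SHA-256 digest of a document's content."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def changed_documents(stored: dict[str, str], incoming: dict[str, str]) -> list[str]:
    """Return IDs of incoming documents that are new or differ from the stored version.

    stored:   document ID -> fingerprint already held in the repository (assumed)
    incoming: document ID -> full text of the newly delivered version (assumed)
    """
    return [
        doc_id
        for doc_id, text in incoming.items()
        if stored.get(doc_id) != fingerprint(text)
    ]


# Hypothetical repository state and delivery, for illustration only.
repository = {"doc-1": fingerprint("old text"), "doc-2": fingerprint("unchanged")}
delivery = {"doc-1": "new text", "doc-2": "unchanged", "doc-3": "brand new"}

to_refresh = changed_documents(repository, delivery)
print(to_refresh)  # only the changed and new documents are listed
```

With this approach, unchanged documents are skipped entirely, so the processing cost follows the size of the change rather than the size of the repository.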

Transparency over individual publications, and the information they hold, is also imperative. So too is the flexibility to handle data in a way that suits your business, and not the other way around. On top of that, there’s the need for clear oversight and control, so that the current state of a product, and of any document within that product, can be easily fetched on demand. What’s needed is a finely tuned mechanism for figuring out which updates are needed, and where they need to be made. Enter PublishOne.

Faster and more efficient content delivery

In the mid-2000s, the late Johan van Oostveen – co-founder of Diskad – provided EY with a unique solution for downloading large data files. It’s a concept that many of us are familiar with today, but back then it was relatively unheard of. It was the ability to access data whilst it was being created: data streaming.

This allowed EY to make on-demand requests for large volumes of data from their knowledge partner – Wolters Kluwer – packaged in a single delivery. One box of data. The data in this pipeline consisted of complex content in various formats, including the notoriously tricky XML and the more user-friendly PDF. The data was segmented into pieces, one per article: lots of small pieces of text and metadata, plus a layer that connects all the different elements together.
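To give a feel for what processing a segmented delivery as a stream can look like, here is a minimal sketch using Python's standard `xml.etree.ElementTree.iterparse`. The article does not specify the actual delivery format, so the `<delivery>`, `<article>`, `<title>`, and `<body>` element names and the `iter_articles` helper are assumptions made purely for illustration.

```python
import io
import xml.etree.ElementTree as ET

# Hypothetical delivery: one package containing many small per-article pieces.
SAMPLE = b"""<delivery>
  <article id="a1"><title>First</title><body>Text one</body></article>
  <article id="a2"><title>Second</title><body>Text two</body></article>
</delivery>"""


def iter_articles(stream):
    """Yield (id, title) for each <article> as soon as it has been fully read."""
    for _event, elem in ET.iterparse(stream, events=("end",)):
        if elem.tag == "article":
            yield elem.get("id"), elem.findtext("title")
            elem.clear()  # drop the article's children so memory use stays flat


articles = list(iter_articles(io.BytesIO(SAMPLE)))
for article_id, title in articles:
    print(article_id, title)
```

Because each article is handled as soon as its closing tag arrives, the consumer never needs to hold the whole package in memory, which is the practical benefit of working with data while it is still being delivered.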

This solution (and subsequent iterations of it) has been running successfully since 2010. Part of the reason for its longevity is its reliability. EY gets 80% of the data needed, with only 20% of the issues associated with alternative methods. In fact, there are rarely any issues with the data processing pipeline.

EY has been receiving huge volumes of data from Wolters Kluwer – through PublishOne – for many years. They have clear lines of communication and a relationship with PublishOne that has evolved over the years to that of good sparring partners on the topic of optimizing data flows.