The result was S1000D, a document specification for technical publications. Its XML hierarchy tags all parts of a document, such as chapters, notes, diagrams, tables and bulleted lists. The standard defines schemas for sections of a manual like maintenance procedures, descriptions of parts and illustrated catalogues. Once digitized in this manner, documents can be reproduced in any format. This would also support any new technology that’s adopted, such as IoT and AI analytics.
Advances in optical character recognition (OCR) have made the conversion of paper documents into electronic form much faster. OCR software can turn scanned images into readable documents and reproduce them in XML or Excel. But for industries like aerospace and defence, these are not useful until they’re tagged and structured according to S1000D specifications.
That’s where Bengaluru-based startup Stelae Technologies has built its expertise. Its product, Khemeia, uses smart algorithms to recognize patterns and identify content elements in documents. With the S1000D standard embedded in it, the software can automatically determine which content element goes under which tag in the XML hierarchy and which section of the document goes into which schema. And it does all this much faster and more accurately than the usual manual methods, explains Stelae co-founder and CEO Aruna Schwarz.
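The general idea of recognizing patterns and assigning content elements to tags can be illustrated with a minimal Python sketch. It is not Stelae's algorithm, and it does not use real S1000D element names: the rules, the tag names (`title`, `note`, `listItem`, `para`) and the sample lines are all hypothetical, chosen only to show how lines of plain text might be matched against patterns and placed into an XML hierarchy.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical pattern rules, checked in order. The tag names are
# illustrative only and are NOT actual S1000D element names.
RULES = [
    ("title", re.compile(r"^(?:CHAPTER|SECTION)\s+\d+\b", re.IGNORECASE)),
    ("note", re.compile(r"^NOTE\s*:", re.IGNORECASE)),
    ("listItem", re.compile(r"^\s*(?:[-*\u2022]|\d+[.)])\s+")),
]


def classify(line):
    """Return the tag of the first rule whose pattern matches the line."""
    for tag, pattern in RULES:
        if pattern.match(line):
            return tag
    return "para"  # fallback when no pattern matches


def to_xml(lines):
    """Wrap each non-empty line in an element named after its classified tag."""
    root = ET.Element("document")
    for line in lines:
        text = line.strip()
        if not text:
            continue
        ET.SubElement(root, classify(line)).text = text
    return root


# Made-up sample input standing in for OCR output.
sample = [
    "Chapter 1 Hydraulic pump",
    "NOTE: Depressurise the system first.",
    "- Remove the access panel",
    "2. Disconnect the supply line",
    "The pump is mounted on the aft bulkhead.",
]

root = to_xml(sample)
print(ET.tostring(root, encoding="unicode"))
```

A production system would need far richer rules, including layout cues, tables and diagrams, plus validation against the actual schema. The sketch only shows the basic step of mapping recognized patterns to tags.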
Stelae came onto US defence contractor Raytheon’s radar at an S1000D user conference in the US last year. Raytheon Eagle, a logistics and analytics software system, supports the entire life cycle of Raytheon products from production to maintenance, both internally and for resellers, customers and others in its ecosystem. Eagle’s programme manager Jarom Walcott needed an elegant, automated solution to convert documents into S1000D instead of the ‘brute force’ methods of BPO-type companies, which throw bodies at the onerous and time-consuming task. The discovery of Stelae’s technology led to several trips to Bengaluru for Walcott that culminated in the signing of a deal a couple of months ago.
The hardest part was getting his legal team to come up with a new type of agreement because this is the first time Raytheon will be reselling a product from another company as part of its ensemble. The US company is used to signing deals with those who resell its products, but hadn’t been in a reseller’s shoes. This was new territory for the 98-year-old defence behemoth.
The use cases for applying AI to documents and deriving insights are growing rapidly with the likes of Google building the resources for it on the cloud. But before any of these can come into play, documents need to be available in a standard, structured form. This is the upstream area Stelae has been working in from its inception in 2002.
After doing an MBA in the UK and working with telecom companies, Schwarz spotted the opportunity in digitizing and structuring documents while working for a content management vendor in Paris. She launched Stelae in France in 2002 along with her CTO, Pierre Fraisse, who had a PhD in mathematics and had worked in the publishing industry for two decades.
Their first major client was the French media company Lagardère, and they cut their teeth on digitizing complex fashion magazine content. The duo branched out into legal content which paid better. But Stelae’s big breakthrough in the manufacturing sector came in 2010 when Rolls-Royce adopted its technology. Soon the startup made inroads into the defence sector, and a major project in India prompted its shift to Bengaluru, with the company being registered in Chennai in 2012.
Selection into the Airbus Bizlab accelerator programme in Bengaluru in 2017 further concentrated its attention on aerospace and defence, where legacy documents are a humongous problem needing a rapid solution.
“AI technologies for predictive maintenance are not working because there’s the underlying problem of 80% of data being in documents,” says Schwarz. “If you cannot extract the data and put it into a database, there’s nothing to analyze.” That’s why a company like Raytheon needs Stelae at a more basic level of digital transformation before anything else can happen.
It’s the same reason why German rail company Deutsche Bahn recently signed on Stelae for a proof-of-concept project that kicks off on February 18 in Berlin. “There were two very cool AI companies competing with us in our space, which is automating rail documentation. But they didn’t address the problem,” says Schwarz. “AI can do a lot of things like analyzing inventories, reducing downtime and so on, but Deutsche Bahn wanted a hardcore technology like ours at the outset to transform the infrastructure.”
OPENING US DOORS
The S1000D standard into which Stelae is structuring Raytheon documents is one of many that the startup works with. There are variants called Raildex for the railway industry and Shipdex for shipping. With each client, the core algorithms remain the same, but Stelae adds new domain knowledge and algorithms. For example, Pierre Fraisse’s team will work with a new schema that Deutsche Bahn uses.
Even within Raytheon, there are other document standards apart from S1000D. In each case, the Stelae product recognizes patterns in documents and structures them. And it can do this in multiple languages, from Arabic to Russian and English, because the algorithms identify content elements and don’t need to understand the meaning of the documents.
Working with Raytheon, and having its product shipped with Raytheon products to provide documentation support to the US company’s customers, opens new doors for Stelae, which has so far been more active in Europe. Stelae is now eyeing another huge traditional industry in the US, the oil and gas sector. The prestige that comes from being with Raytheon creates possibilities in adjacent industries.
Conversely, Stelae could help Raytheon enter new areas such as the European railway industry or make inroads into the Indian market. It also makes a significant difference to Raytheon that its products will ship to customers with digital documentation support. That way, the customer doesn’t have to go looking for a solution from a third party, which helps to build loyalty. That an Indian startup is the first with which Raytheon Eagle has built this level of partnership makes it a big deal for the software product ecosystem in Bengaluru.
Sumit Chakraberty is a consulting editor with Mint. Write to him at email@example.com