DataCapTree: a new chapter in Stichting HIV Monitoring’s data collection process

In May 2014, the AMC announced the phasing out of the data entry system that Stichting HIV Monitoring (SHM) was using at the time. As a result, SHM’s plans to develop a new system were put on a fast track. After extensive development work and an intense period of building, SHM’s new data entry system finally went live on 5 February 2018! We spoke to those who developed and now work with the new data entry system, known as DataCapTree.


“Looking back, it was a successful project, with close collaboration between all the parties, during which many people developed new skills”, says Sima Zaheri, SHM’s deputy director and manager of the Data Unit. “It’s a system that we’ve built ourselves and, as a result, it meets our needs seamlessly. We’ve named the system DataCapTree, to emphasise the fact that it captures data using a structure based on decision trees.”

A non-standard solution

The new data entry system was built using the LogicNets Medical Decision Framework software. “This system allows organisations to model their own system within our framework. Just as Microsoft does with Word or Excel, we supply our system in which users can create their own content,” explains Jelle Ferwerda, LogicNets’ CEO. “The advantage of such a system is that you can configure the application yourself. Moreover, we are also constantly in development, which means the system is able to keep up with the requirements demanded of it. We knew from the outset that SHM would test the limits of the system, and this really has been the case. We’ve definitely got the most out of the system.”

Sima says, “One of the reasons for choosing the LogicNets system was the decision-support feature. Our previous data entry system was very time-consuming because data had to be structured during the data entry process. This has been avoided in the new system by programming decision trees to support the data entry. The main advantage of this is that we can now change content within the system to adapt to future developments.”

Building the system in house

The system has been primarily built in house by SHM’s data collectors and data quality staff. Peter Chao, LogicNets business consultant and architect at ICT Group, says, “We trained and supported SHM’s staff in the building process. This means that SHM is now in control of all the necessary expansions and protocol changes, without needing assistance from us.” Sima adds, “Building the protocols well is extremely complicated. Because SHM’s data are quite complex, we decided it would be more efficient to train our own staff in modelling, rather than train external people to understand our work. SHM’s builders followed a one-week course prior to the start of the project and have been supported throughout the project by ICT Group. For example, each week we had ‘Peter’s hour’, during which problems could be discussed and resolved.” Peter says, “The system builders are now able to do a huge amount themselves, but of course we remain available for any problems that they cannot resolve themselves.”

Thousands of decision trees

Tieme Woudstra, one of SHM’s in-house DataCapTree builders, says, “The main difference with the previous system is that we now work with query flows driven by decision trees. For example, if you enter an answer, the relevant follow-up questions will automatically appear. We didn’t have this feature in the previous system. Instead, all the questions would appear and, based on the protocols, you had to decide yourself which fields you should and shouldn’t fill in. Although translating all our protocols into decision tree structures was incredibly time-consuming, we have now managed to build thousands of decision trees.

“Another major advantage of DataCapTree is that data can be immediately checked for errors. If, for example, a data collector enters a medication start date that is later than the patient’s date of death, the data collector will immediately be alerted. This hugely improves efficiency, because it means the data no longer need to be checked for these things and the data in the database are cleaner.”
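The two behaviours Tieme describes, follow-up questions that appear only after a relevant answer and an immediate alert for an impossible date, can be illustrated with a minimal Python sketch. This is not how DataCapTree or the LogicNets framework is implemented; the `Question` class, the function names and the protocol fragment are all hypothetical and serve only to show the idea.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import Optional


@dataclass
class Question:
    """One node in a hypothetical decision tree."""
    key: str
    text: str
    # Maps an answer to the follow-up questions that this answer triggers.
    follow_ups: dict[str, list["Question"]] = field(default_factory=dict)


def questions_to_ask(root: Question, answers: dict[str, str]) -> list[str]:
    """Walk the tree and return the keys of the questions that apply."""
    asked = [root.key]
    answer = answers.get(root.key)
    # Only the branch matching the given answer is followed.
    for child in root.follow_ups.get(answer, []):
        asked.extend(questions_to_ask(child, answers))
    return asked


def check_dates(medication_start: date, date_of_death: Optional[date]) -> list[str]:
    """Return an alert if medication starts after the date of death."""
    if date_of_death is not None and medication_start > date_of_death:
        return ["Medication start date is later than the date of death."]
    return []


# Hypothetical protocol fragment, invented for this illustration.
medication = Question(
    key="on_medication",
    text="Is the patient on medication?",
    follow_ups={
        "yes": [
            Question("start_date", "When did the medication start?"),
            Question("regimen", "Which regimen was prescribed?"),
        ]
    },
)

print(questions_to_ask(medication, {"on_medication": "yes"}))
print(questions_to_ask(medication, {"on_medication": "no"}))
print(check_dates(date(2018, 3, 1), date(2018, 2, 1)))
```

Answering “yes” brings up the two follow-up questions, while “no” leaves only the first question. The date check returns an alert at the moment of entry, so the error does not have to be found later in the database.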

Testing, testing and more testing

Once the protocols had been built, they needed to be rigorously tested. This was done by a group of data collectors, who went through the protocols, testing for all possible scenarios. One of the data collectors who tested the protocols is Femke Paling, who explains, “The main aim of testing was to check that everything worked as it should. Before we started, we drew up test plans, so that afterwards we could check whether everything was in the data warehouse in the right format. We also tested numerous scenarios.” Sima adds, “Because the testing took a lot of time and we really wanted everything to be thoroughly tested before we started using it, we decided to launch the system with just the key protocols to begin with. The outstanding decision trees are being completed at the moment and we hope that the new data entry system will be fully functional by the third quarter of this year.”

New data model for the data warehouse

A new way of entering data also meant developing a new data model for the data warehouse. “In contrast to the one used previously, the new data model has a relational structure, allowing us to create far more links between the data”, say Mariska Hillebregt and Anne de Jong, both data managers at SHM. “The historical data have also been migrated to this new relational structure. In addition, each answer option used during the data collection process has now been given a unique code. This avoids any overlap between tables in which the same codes are used with different meanings. These changes have simplified the process and also reduce the risk of errors.”
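To make the point about unique codes concrete, the sketch below shows, in Python, how the same code can mean different things in separate lookup tables and how giving every answer option its own code removes that ambiguity. The table names, labels and code range are invented for illustration and are not taken from SHM’s data model.

```python
# Hypothetical legacy lookup tables: code 1 means something different in each.
legacy_tables = {
    "smoking": {1: "never", 2: "current"},
    "alcohol": {1: "none", 2: "daily"},
}


def assign_unique_codes(tables, first_code=1000):
    """Give every (table, label) pair its own code across all tables."""
    unique = {}
    code = first_code
    for table_name in sorted(tables):
        for old_code in sorted(tables[table_name]):
            unique[code] = (table_name, tables[table_name][old_code])
            code += 1
    return unique


answer_options = assign_unique_codes(legacy_tables)
for code, (table_name, label) in answer_options.items():
    print(code, table_name, label)
```

In the legacy tables, a bare `1` cannot be interpreted without knowing which table it came from. After reassignment, each code identifies exactly one answer option.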

Using the new system

The system has been in use with the first set of protocols for two months now. Following a number of training days, the data collectors started using DataCapTree to collect data in February 2018 and are supported by the data collectors’ help desk. To date, experience with DataCapTree is very positive. “DataCapTree has significantly changed our work”, says Femke. “The decision-support structure means you no longer have to look up each protocol to know what to collect because this is now indicated in the system. This means that data entry is far less prone to errors. In addition, we are now asked to provide more information for many of the events. For example, instead of just noting that a certain diagnosis has been made, we are now also required to provide information underlying the diagnosis. This helps the data quality staff and the researchers to check and analyse the data, respectively. Switching to this new way of working required some getting used to, but the new system offers so many new possibilities and everyone is really pleased with it. Personally, I find it very clear and easy to use”.

Data quality

There have been changes for SHM’s data quality staff too. “In contrast to the previous situation, the quality control functionality is now fully integrated into the data entry system. This has given us far greater insight into the work flow. Everyone can now see how far along the data are in the quality control process; for example, whether the data have been verified by the quality control staff or are awaiting correction by a data collector. Moreover, any change made to a patient’s data after approval of the data is flagged, making it easier for us to check what has been changed and why. In addition, the new system allows us to plan our work and carry out administrative tasks”, says Monika Raethke, one of SHM’s data quality staff. “Thanks to the decision-support design, the quality of the data is higher. The system picks up most data entry errors, so we don’t have to run checks on data entry. This frees up more time to check the data for other issues”, continues Monika.
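The workflow Monika describes, a visible quality control status plus a flag on any change made after approval, could be modelled roughly as follows. This is a simplified Python sketch under assumed names (`Status`, `Record`, `update`) and does not reflect DataCapTree’s actual internals.

```python
from dataclasses import dataclass, field
from enum import Enum


class Status(Enum):
    """Hypothetical stages in the quality control process."""
    ENTERED = "entered"
    AWAITING_CORRECTION = "awaiting correction"
    APPROVED = "approved"


@dataclass
class Record:
    values: dict
    status: Status = Status.ENTERED
    # Each flag stores what changed after approval, and why.
    flags: list = field(default_factory=list)

    def update(self, key, new_value, reason):
        old_value = self.values.get(key)
        # Only changes made to already approved data are flagged.
        if self.status is Status.APPROVED and old_value != new_value:
            self.flags.append((key, old_value, new_value, reason))
        self.values[key] = new_value


record = Record(values={"diagnosis": "A"})
record.update("diagnosis", "B", "typing error")   # before approval: no flag
record.status = Status.APPROVED
record.update("diagnosis", "C", "new lab result")  # after approval: flagged
print(record.status.value, record.flags)
```

Because the flag keeps the old value, the new value and the reason together, a reviewer can see what was changed and why without comparing versions by hand.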

Two-step verification

“During the project we received excellent support from the AMC’s IT services, ADICT. This was very valuable because ADICT have a great deal of knowledge and we already worked closely with them and are part of the AMC’s network”, says Sima. Having previously worked on SHM projects, Aad Lehmann of ADICT already had a good understanding of the organisation. “ADICT supported SHM throughout the process and were involved not only in determining the scope of the system, but also in thinking about potential issues. We also manage the application that SHM uses for registering and de-registering patients, as well as the application for user authorisation, and we are responsible for ensuring external users have access to the appropriate applications. ADICT was involved in setting up secure access to these applications, including DataCapTree, which has resulted in a two-step verification process”, says Aad. “My colleague Mariska Marcelis managed the project within ADICT and ensured that all the groups in our department knew what their tasks were. The project has meant that ADICT’s role has changed. Previously we carried out mainly management and building tasks for SHM, whereas now we have more of a support role because SHM has now become far more autonomous”, continues Aad.

Efficiency and durability

Efficiency and durability were two key requirements for the system. “And we truly have fulfilled them! With this new system we can expand to new studies, but we are also able to adapt to the possibility of collecting more data electronically rather than manually. In addition, the process has become far more efficient because any unnecessary steps have been removed, the data collection coordinator can now manage the data collectors’ work better by highlighting priorities, and, due to the new data model for the data warehouse, our researchers can now work with real-time data”, concludes Sima.