One Minute Problem Summary
This medical record runs in a Google Chrome browser on tablets and desktops in each clinic. These devices are connected to a router that transmits via 3G/4G cellular and a VPN to our central server. The router, cellular, and server can slow down (during peak hours), stop working when the power supply is unreliable, or malfunction.
We want to build a more robust system that allows a clinic to continue to function (and remain locally in sync) even if the cellular connection or central server fails, and ideally even if the local router malfunctions.
The clinic system would then re-sync with the central server when the connection is restored (typically 5 minutes to a few hours later).
Our users are also merciless about system speed, and the only way I see to answer this need is to build the system with an offline-first methodology so we can take advantage of pre-caching the patient record.
1. The problem in detail
The hard part of this project is that a patient’s chart must remain synchronized between all users within a clinic, and that clinic must also sync as often as possible with a central server.
This is different from most offline projects, where each individual device stores its own cache and then re-syncs with the central server on its own when connectivity is restored.
The reason that the devices within a clinic must remain in sync is that a patient will be seen by different clinicians during one encounter. For instance, a patient may first see the registration desk, then nurses, clinical officers, consultants, nutritionists, social workers, pharmacists, etc. Each subsequent encounter requires the information from the prior person.
2. Current Infrastructure
2.1. The electronic medical record runs inside a Chrome Browser. This renders an Angular front-end.
2.2. Devices include: tablets, Chromebooks, and desktops.
- Tablets were the primary devices used given their low cost, and ability to operate for many hours without power
- Laptops/Desktops are used at some sites with more reliable power
2.3. The clinic devices are connected at the site via clinic router and access points.
2.4. The router is supplied via a backup battery that is charged by solar. The power may go out throughout the week at remote sites for several hours at a time.
2.5. The router has a SIM card that connects it to the 3G/4G network, creating a secure VPN from the remote clinic to the central server. A few sites outside of cellular networks are connected via VSAT satellite internet.
2.6. The central server runs multiple applications on virtual machines.
3. Current challenges
These challenges are listed from the device to the server.
3.1. Global system speed
Users want everything in the system to be lightning fast. Many users have never used a computer before, and they benchmark the system’s speed against that of their native phone apps.
3.2. Cellular network
Sometimes the cellular connection becomes dodgy during periods of heavy rain, or may have intermittent connectivity issues in remote locations.
Outages of a few seconds are not uncommon. These cause delays in system performance and irritate users.
Outages of several minutes to hours are unpredictable and bring the clinic to a halt.
3.3. Clinic router and access point malfunctions
At times the router or access point in a clinic malfunctions and must be replaced. This brings the entire clinic to a halt, sometimes for up to several days, until a new router and technician can be deployed to the remote rural site.
Battery failures: the clinic battery may not be charged if the generator or solar was not used properly. This also contributes to router failures.
Installing a cellular SIM card in each of hundreds of individual devices as a way around the router is prohibitively costly, which is in part why a router is used at the sites to provide internet.
3.4. Mobile Clinics
If possible, it would be nice if devices set up at a ‘pop-up’ clinic could ‘remain in sync’ without the need for a router or modem. This would simplify the amount of hardware / batteries / etc to be transported to the mobile clinics.
3.5. Remote servers (at clinical sites)
At this time remote servers are not placed at clinical sites. In the past they were used in a slightly different capacity. Remote servers pose several problems, including:
- Added cost in order to scale this system to new clinics
- Insufficient network IT personnel to maintain and repair multiple remote servers
- Power at rural sites is unstable, and the solar powered backup battery system is insufficient to power a small server at the site in addition to router and tablets
- Merging & conflicts are more difficult to manage when the system is used offline. A real-time cellular system is able to maintain version control much more easily.
- The software application runs on multiple servers / VMs at once. This requires a data centre to coordinate, and is not possible at multiple distributed sites.
3.6. Central server slowdowns
The central server can slow down the network for a few reasons:
- Peak Usage: during peak hours around 11am the central server experiences heavy loads and the system slows down.
- Expanded user base: as more users are added to the system, further financial resources will be needed to handle the load.
- Central server crashes: fortunately this is very rare, but when it does occur the entire network of clinics stops working.
4. Possible offline solutions
We identified four different solutions to help provide offline support.
Each has its own benefits and problems.
Ultimately, we are looking for something that is reliable, scalable, and very simple - ideally something that requires no on-the-ground IT support.
Option 1. Local server at each site
Devices would interact with this local site server the same way they did previously with the central server. The local server would then synchronize with the central server.
In many ways this is the least desirable solution, because all it does is create new problems:
Problems:
- If the local server crashes, the clinic crashes
- It adds another network IT device to maintain
- Powering the server may be difficult
- There is still a synchronization / data reconciliation issue between local servers and the main server.
Advantages
- There is ‘one source of truth’ as the tablet devices talk to the local server to identify information.
Option 2. A progressive web app (PWA) (‘offline-first approach’)
The tablets and other devices each use their own local cache to store information. Each device talks to the central server as it currently does.
Problems:
- This does not create a solution where multiple devices in clinic can remain in sync at once. From a workflow perspective this is a major problem.
- How do we use ‘server side’ applications when we are running the system using the offline mode on the device? [not impossible, though needs some thought]
Advantages
- Progressive web apps are becoming mainstream and well supported
- Because data is cached locally to the device by default, use of the website in online or offline mode is a lightning fast user experience. (This is the reason that all websites should be built with an offline-first approach - even in high connectivity environments)
- An offline-first solution will reduce central server requests, and this will help address the issues of peak usage and an expanding user base.
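To make the "fewer server requests" point concrete, here is a minimal sketch of a cache-first read. It is written in Python only for illustration; in the real app this logic would live in the browser (for example in a service worker). The class name `CacheFirstStore`, the `fetch_remote` callback, and the five minute freshness window are all assumptions of mine, not part of our system.

```python
import time
from typing import Callable, Dict, Optional, Tuple


class CacheFirstStore:
    """Serve records from a local cache and only call the server when needed."""

    def __init__(self, fetch_remote: Callable[[str], dict],
                 max_age_s: float = 300.0,
                 clock: Callable[[], float] = time.time):
        self._fetch_remote = fetch_remote   # hypothetical call to the central server
        self._max_age_s = max_age_s         # assumed freshness window for a cached copy
        self._clock = clock
        self._cache: Dict[str, Tuple[float, dict]] = {}
        self.remote_calls = 0               # counter, only to show the saved requests

    def get(self, patient_id: str) -> Optional[dict]:
        cached = self._cache.get(patient_id)
        now = self._clock()
        if cached is not None and now - cached[0] <= self._max_age_s:
            return cached[1]                # fresh cache hit: no network at all
        try:
            self.remote_calls += 1
            record = self._fetch_remote(patient_id)
        except ConnectionError:
            # Offline: fall back to the stale copy if we have one.
            return cached[1] if cached is not None else None
        self._cache[patient_id] = (now, record)
        return record
```

Repeated reads of the same chart within the freshness window never touch the network, and a stale copy is still shown when the connection is down. Note that this only covers one device; it does nothing to keep several devices in a clinic in sync, which is the problem listed above.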
Option 3: Option 1 (local server at each site) + Option 2 (progressive web app / offline-first)
This is a hybrid of installing a local server and building a progressive web app.
It brings some advantages over doing only Option 1 or only Option 2, but it retains the same problem: if the local server malfunctions, the tablets in clinic are unable to sync, and this is a major issue from a workflow perspective.
The solution also remains rather inelegant in the number of data syncs required, and the local server is a non-scalable solution.
Option 4. Offline peer-to-peer system (offline mesh network)
The devices at a local clinic are able to remain in sync with each other, and the local clinic is able to synchronize itself with the central server whenever possible.
At times a local device such as a desktop/server could be installed into the local mesh network to add greater data capacity to the local instance, but this is not required.
This option also includes Option 2 (Progressive Web App / offline-first architecture).
Problems:
- This seems really hard to build. [though this is not impossible :)]
- How is data reconciled & synced?
Advantages
- All devices in clinic can remain in sync
- No additional major hardware / network IT support required
- Impervious to network / cellular / server / power problems
- Benefits of an offline first approach
- Able to work at very small clinics to large clinics
- Fewer queries to the server, less data downloaded, cheaper cellular usage and reduced server load
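On the question of how data is reconciled and synced, one simple strategy (not necessarily the one we would end up using) is a per-field "last writer wins" merge: each field of a chart carries the time and device of its latest edit, and when two devices exchange charts the newer edit wins. The sketch below assumes this strategy; `FieldVersion`, `merge_charts`, and the field names are hypothetical, and it assumes device clocks (or logical counters) are comparable across devices, which is a real concern on tablets.

```python
from dataclasses import dataclass
from typing import Any, Dict, Tuple


@dataclass(frozen=True)
class FieldVersion:
    value: Any
    timestamp: int   # assumed logical or wall-clock time of the edit
    device_id: str   # breaks ties between edits with equal timestamps

    def stamp(self) -> Tuple[int, str]:
        return (self.timestamp, self.device_id)


Chart = Dict[str, FieldVersion]


def merge_charts(local: Chart, remote: Chart) -> Chart:
    """Merge two copies of a chart field by field, keeping the newest edit."""
    merged = dict(local)
    for field, theirs in remote.items():
        ours = merged.get(field)
        if ours is None or theirs.stamp() > ours.stamp():
            merged[field] = theirs
    return merged


# Two tablets edited the same patient while out of contact with each other.
tablet_a = {"bp": FieldVersion("120/80", 5, "tablet-a"),
            "weight": FieldVersion(70, 3, "tablet-a")}
tablet_b = {"bp": FieldVersion("130/85", 7, "tablet-b"),
            "allergy": FieldVersion("penicillin", 2, "tablet-b")}
merged = merge_charts(tablet_a, tablet_b)
```

The useful property is that the merge gives the same result in whichever order devices exchange data, so every tablet converges on the same chart without a central coordinator. The cost is that one of two concurrent edits to the same field is silently dropped, which may not be acceptable for clinical data, so a real system would likely need to flag such conflicts for review.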
5a. Levels of Offline EHR Functionality v0.1
I made up an eight-level scale of increasing offline functionality. The thought is that as healthcare offline apps achieve better offline functionality they will move higher up this scale.
Each level includes the functionality of the previous levels
Level 1: Offline data viewing
When network connection is lost, data can still be viewed within the application.
Level 2: Single device, two-way sync
Includes the previous level (able to view some data offline), plus data can be captured offline on a device and synchronized from that device to the central server. The device does not have to synchronize with other devices.
eg Retrospective (after clinical encounter) entry of paper file into laptop or desktop: A patient’s record can be searched for, or a new patient can be created. Forms are available to enter information about the patient
eg Collect patient information at a patient’s house on phone or tablet: A limited amount of historical patient information is available on the device. The device synchronizes ideally several times a day, or whenever connectivity is around, with the central server so that progress can be tracked and data not lost.
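Level 2 can be sketched as an "outbox": every change is first recorded locally, and the device tries to upload the queue whenever it has connectivity. The names below (`Outbox`, `send`) are hypothetical, and the sketch keeps the queue in memory; a real device would have to persist it (for example in the browser's local storage) so that data is not lost if the tablet restarts.

```python
from collections import deque
from typing import Callable, Deque, Dict


class Outbox:
    """Queue of locally captured changes waiting to reach the central server."""

    def __init__(self, send: Callable[[Dict], None]):
        # Hypothetical upload call; assumed to raise ConnectionError when offline.
        self._send = send
        self._pending: Deque[Dict] = deque()

    def record(self, change: Dict) -> None:
        self._pending.append(change)      # always succeeds, online or not

    def flush(self) -> int:
        """Try to upload pending changes in order; return how many were sent."""
        sent = 0
        while self._pending:
            try:
                self._send(self._pending[0])
            except ConnectionError:
                break                     # still offline: keep the rest for later
            self._pending.popleft()       # drop only after a successful send
            sent += 1
        return sent

    def __len__(self) -> int:
        return len(self._pending)
```

Calling `flush()` several times a day, or whenever the connection comes back, matches the home-visit example above. If a send succeeds on the server but the confirmation is lost, the same change would be sent twice, so the server side would need to tolerate duplicates.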
Level 3: Multi-device offline network with small sync history
eg Multiple devices at screening clinic: Multiple devices within the clinic are in sync: eg registration / blood pressure screen / cervical cancer screen / mental health screen / diabetes screen. When possible (during or after the clinic), the information from the screening clinic is synced to the main server. If required, or if it is easier from the technical side, the parts of a patient’s chart that are used at the clinic level can be ‘locked out’ at the central server while they are in use by the clinic.
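The ‘lock out’ idea could look something like the sketch below: the central server records which clinic currently holds a section of a chart and refuses the same section to other clinics. Everything here is an assumption for illustration, including the `ChartLocks` name, the section names, and the lease that expires after a few hours so that a clinic that never reconnects cannot block a chart forever.

```python
from typing import Dict, Optional, Tuple


class ChartLocks:
    """Track which clinic currently holds a section of a patient's chart."""

    def __init__(self) -> None:
        # (patient_id, section) -> (clinic_id, lease expiry time in seconds)
        self._locks: Dict[Tuple[str, str], Tuple[str, float]] = {}

    def acquire(self, patient_id: str, section: str, clinic_id: str,
                now: float, lease_s: float = 4 * 3600) -> bool:
        key = (patient_id, section)
        holder = self._locks.get(key)
        if holder is not None and holder[0] != clinic_id and holder[1] > now:
            return False                  # another clinic holds a live lease
        self._locks[key] = (clinic_id, now + lease_s)
        return True

    def release(self, patient_id: str, section: str, clinic_id: str) -> None:
        key = (patient_id, section)
        holder = self._locks.get(key)
        if holder is not None and holder[0] == clinic_id:
            del self._locks[key]          # only the holder may release

    def holder(self, patient_id: str, section: str, now: float) -> Optional[str]:
        entry = self._locks.get((patient_id, section))
        if entry is None or entry[1] <= now:
            return None
        return entry[0]
```

Locking avoids merge conflicts entirely for the locked sections, at the cost of blocking other sites. That trade-off is why it only appears at this level and is dropped again at Level 5, where users in different clinics need the chart at the same time.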
Level 4: Multi-device offline network with medium sync history
eg Multiple devices at small dispensary: View more comprehensive past patient history. More users, such as pharmacy, within clinic’s real-time network.
Level 5: Multi-device offline network with large sync history
eg Multiple devices at large clinic: Extensive past patient history available in offline state. A device at the clinic may help add extra ‘offline’ capacity to the local network to hold more data. No clinic chart lock-out, as users in different clinics require access to chart at same time. More advanced methods to merge conflicts.
Level 6: Multi-device offline network with large sync history + Real time ancillary services
eg Multiple devices at large clinic with real time labs: Continuous sync of real time ancillary services. The patient chart within clinic requires real time syncing with the central server - as new information will be coming in during the clinical encounter - such as data from the lab, imaging, insurance desks, etc (that are located outside the ‘clinic mesh network’).
Level 7: Hospital
Somehow an entire hospital would function with intermittent connectivity.
Level 8: County
A peer-to-peer offline health data network throughout a country. I haven’t thought this through yet.
5b. Amount of patient historical data in offline mode
Example of the levels of data one may want to sync to be available when a device is offline.
This is very rough (version 1).
The numbers represent the number of years of data to sync in each category from 1 to 5. The number 6 is meant to represent, ‘all data’ available.
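As a small sketch of how such numbers could drive a sync, the function below turns a level into the oldest date worth syncing, with 6 meaning no cutoff at all. The category names and values in `EXAMPLE_POLICY` are made up for illustration and are not taken from our actual table.

```python
from datetime import date
from typing import Dict, Optional

ALL_DATA = 6  # per the text above, 6 stands for "all data"


def sync_cutoff(level: int, today: date) -> Optional[date]:
    """Return the oldest date to sync, or None when all data should be synced."""
    if level == ALL_DATA:
        return None
    if not 1 <= level <= 5:
        raise ValueError("level must be between 1 and 6")
    try:
        return today.replace(year=today.year - level)
    except ValueError:
        # 29 February in a non-leap target year: fall back to 28 February.
        return today.replace(year=today.year - level, day=28)


# Hypothetical categories and values, for illustration only.
EXAMPLE_POLICY: Dict[str, int] = {
    "vitals": 1,
    "medications": 2,
    "diagnoses": ALL_DATA,
}


def cutoffs(policy: Dict[str, int], today: date) -> Dict[str, Optional[date]]:
    """Compute the sync cutoff date for every category in a policy."""
    return {category: sync_cutoff(level, today)
            for category, level in policy.items()}
```

A device would then request, for each category, only the records dated on or after the cutoff, which keeps the offline cache small on tablets while still holding the history clinicians are most likely to need.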
6. The offline solution
We are still in the research and experimentation phase - looking for answers, suggestions, and advice.
Sincere thank you to everyone who has helped contribute to solving this problem. I’ve been really impressed with the suggestions. Many things in the area of offline mesh networks that I thought were impossible, too hard, or used underdeveloped technologies are actually much more advanced and robust than I previously realized.
How can this be done? The details for another post… Though, I can’t exactly promise when that will be.
Special thanks to Dr Thomas Mwogi for his key insights in framing this problem and its possible solutions.
Further Reading
I highly recommend reading these two articles from the team at Simple.org that show the benefits of offline first apps in clinic.
What we are learning by creating an ultra-thin EMR - Sept 2019
Offline-first apps are appropriate for many clinical environments - Jan 2 2020