There is a myth that reconciling off-line data collection/manipulation with on-line data systems is easily solved through "synchronisation". In fact this is a complicated operation: the more complex a data system is, the higher the chance that "synchronisation" becomes a nightmare. A short scenario demonstrates this.
Imagine you have a form on an off-line tablet that asks about an event with the response values 1 = less than 6 months, 2 = 6 to 12 months, and 3 = more than 12 months. After the data collection project has been running for a little while, the researchers find that more precision is required and replace the first value with two new codes: 4 for "0 to 3 months" and 5 for "3 to 6 months". This becomes version 2 of the questionnaire. People who start using it send back correct, more precise data – a good thing indeed. The main database now expects to see the values 2, 3, 4 and 5. But for people still using version 1 of the questionnaire, what does the database do with the now obsolete value 1? Must a special recoding process be programmed into the data upload module? If so, what do you recode 1 as – 4 or 5? Imagine that soon afterwards there are text changes to a different question. This third version can change how someone responds to the question compared with earlier versions.
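To make the difficulty concrete, here is a minimal sketch of what a version-aware upload step might look like. All names here are invented for illustration; the point is that the obsolete code 1 has no correct automatic mapping onto the new codes 4 and 5, so the upload module can only reject or flag it:

```python
# Hypothetical version-aware recoding during data upload (illustrative only).
# Each incoming record carries the questionnaire version it was collected under.

VALID_CODES = {
    1: {1, 2, 3},      # version 1: 1 = <6 mo, 2 = 6-12 mo, 3 = >12 mo
    2: {2, 3, 4, 5},   # version 2: code 1 split into 4 = 0-3 mo, 5 = 3-6 mo
}

def recode_on_upload(version, code):
    """Return the code to store, or raise when no unambiguous mapping exists."""
    if code not in VALID_CODES[version]:
        raise ValueError(f"code {code} is not valid for version {version}")
    if version == 1 and code == 1:
        # The old "<6 months" answer spans both new codes 4 and 5;
        # there is no defensible automatic choice, so flag it instead.
        raise ValueError("obsolete code 1 cannot be mapped to 4 or 5 unambiguously")
    return code
```

Codes that survive across versions pass straight through, but every record collected under the old form with the split code forces a human decision or a data-quality flag – exactly the recoding burden the scenario describes.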
Yes, "synchronising" data coming from forms CAN work in simple cases where new questions are merely added, but as soon as someone adds precision to a form, coding problems arise that quickly spiral out of control. Simply "synchronising" the off-line collected data turns out to be "not so simple" after all. And this common scenario illustrates only one type of synchronisation problem. Another consequence of the "just synchronise" myth is misallocated resources: allocation decisions can only be made from data already on the server, not from soon-to-be-uploaded data sitting on a tablet somewhere.