Syncing up external Filemaker data with Rails

One of the web apps I work on leverages a Filemaker database. My co worker Brent the expert Filemaker dude, and I had a serious problem to tackle - how to get Rails and Filemaker talking to each other.

Now Filemaker has iffy SQL standard support and we didn’t necessarily want our web app to run live from the Filemaker database. We spent some time looking at making an ActiveRecord adapter for FM but really at the end of the day it was too much like hard work.

There are 3rd party solutions out there that can do this stuff for you, but if your needs dictate that they won’t work - our experience may help you. I’m going to write up the theory behind this which has been live in production for a good 6 months + and in the coming weeks will release some code acts_as_syncable or similar…

First off we wanted it occassionaly connected, one way synchronising. We had an installed instance of Filemaker Web Publishing engine so we could expose our data as xml views. We used a background rails (backgroundrb) process to query Filemaker every 5-10mins and ask for changed data, pull that data using REXML, and populate/update on the Rails side using ActiveRecord.

Here’s the mechanics…

1. Each object, needs a last_modified timestamp, whenever an update to the object that affects the data being synced, this last_modified timestamp needs to be updated (for all intents and purposes create an updated_at column on the filemaker side).

2. Next on the Filemaker side create a layout that exposes that data when queried via Web Publishing, as an xml result. We’re going to do two different queries so your layout must support these - as per note ii.

3. On the Rails side we need to add two models, and extend the models we want synced. First I created a SyncLog model which will track when syncing occurs (just for informational purposes), then a ClassSyncLog, which will track the models synced, when they were synced, the last timestamp and id of each model synced.

4. For each model on the rails app side, we need a number of properties:
- The corresponding id field name from Filemaker
- A hash listing each rails field and it’s corresponding name on the Filemaker side (we’re assuming that these will be different, in our case they were wildly different)
- Some call backs, pre process sync, and post process sync if you need to do stuff before/after syncing.

5. We need a model that will actually do the work of syncing the objects. This is the involved part of the process that does the grunt work of syncing so pay careful attention.

5.1 Pass in a list of Class names, for each class name do the following
5.2 Query filemaker using Net::HTTP from the last updated_at timestamp and see if any records have been updated since then, if so move on to 5.3 or go to the next class (well… almost there’s a trick here **)
5.3 For that returned result set from Filemaker (from the HTTP Raw data returned), fill an REXML object from the returned xml file. For each of the records in the result set do the following
5.3.1 Find or create the rails side object from the corresponding id in the resultset, check if you can sync it per the pre process rules if you have any.
5.3.2 Iterate over each of the fields in the result set, see if it matches a field in the field mapping on the rails class you’re syncing (4). If it does update the current object from the result set.
5.3.3 Finally save the current object and move onto the next record
5.4 Now before moving onto the next class you need to save a ClassSyncLog record, with the last timestamp of the last updated record, and the last id. So next time you have a starting point to retrieve the records.
5.5 Move onto the next class
5.6 After completing all classes write out a log entry to say it’s been done.
6. Wait 5mins… and repeat.

Of course there’s some caveats.

i. First is if you’re dealing with 100′000’s of records you do not want to grab just one resultset back - you’re going to run out of memory and/or have problems with getting the full stream of data. So we get the result set query into groups of ~ 2000.

ii. Related to this, lets say you have 100′000 records updated at 10:10:01 AM. The first request is going to be for all records greater than the last updated time, for example 9:30AM. So you’d get 2000 in on your 100′000 records, the next time in you ask for > 10:10:01 am, you’re going to skip 98′000 records, so you want to do two queries
first:
a. Give me all records time = last_updated_time AND id > last_updated.id, which will catch the next 2000 records - repeat this one until you get no more results
then:
b. Give me all records time > last_updated_time - then go back to qry #a for the next repeat.

iii. You will have performance issues, this isn’t quick by any stretch of the imagination so be a little patient to work through the issues. We’re running massive datasets so we often see updates that take hours. Not great but it’s ok.

iv. Watch your memory, this can be a long running process so throw in ‘GC.start’ to clean up memory along the way. (eg every resultset retrieved)

v. To avoid ID collissions it’s strongly recommended you use the MySQL master-master auto-increment-increment + offset scheme, so you stagger your ID inserts between your Filemaker DB(s) and MySQL DB(s).

Again look for code in the coming weeks.. I’m wrapping our custom stuff up into an acts_as module for the (limited) audience this may help…

No Comments, Comment or Ping

Reply to “Syncing up external Filemaker data with Rails”

About

Rowan is a Product Development Manager, specialising in architecting, developing and putting web applications into production - in particular Ruby on Rails based apps. He lives in Toronto, Canada but speaks in a funny accent as he's originally from New Zealand. He's been working in the software and web business for over a decade. This blog covers Web Application development and deployment in the real world, dealing with topics from business fundamentals to Ruby on Rails, Merb, PHP, Flex, MySQL, Apache and more.

Read more ...

 

 

View Rowan Hick's profile on LinkedIn

 

Subscribe to my RSS feed