Geek Corner – Integrating Transaction Data into RunSignUp Analytics

You may have seen a new piece of data on your Dashboard Overview page – Transactions! As of Nov. 2, the data still shows $0:

Image-1

While it will be cool to see the real data in another day or so, for geeks, the interesting part is how we are going to show the data!

If you have been following along, we have built our own equivalent of Google Analytics for races. It tracks each click on your website and matches that up against things like Registrations and $ Transactions. It is a separate system built on the latest technology: an API Gateway web service, Lambdas for processing, and a replicated, scalable AWS Aurora database.

We are adding features incrementally, and we are now adding Transaction data. To do this, we have a PHP script running in the background on the main RunSignUp servers to capture the transaction data from each and every transaction back to the spring of 2010. Yes, millions and millions of transactions are being analyzed and the data exported to the Analytics database via the API Gateway and Lambdas and stored in the Aurora database. This process has been running since Oct. 25 – yes, over 7 days to capture all of the data.
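As a rough illustration of the export step described above, here is a minimal sketch of batching records into JSON request bodies. It is written in Python for brevity (the real export is a PHP script), and the field names, the batch size of 500, and the `build_payloads` helper are assumptions made for the example, not details of the actual system. It only builds the payloads; sending them to the API Gateway is left out.

```python
import json
from itertools import islice
from typing import Dict, Iterable, Iterator, List


def batched(records: Iterable[Dict], size: int) -> Iterator[List[Dict]]:
    """Yield successive lists of at most `size` records."""
    it = iter(records)
    while True:
        chunk = list(islice(it, size))
        if not chunk:
            return
        yield chunk


def build_payloads(records: Iterable[Dict], batch_size: int = 500) -> Iterator[str]:
    """Serialize each batch as one JSON body (hypothetical request format)."""
    for chunk in batched(records, batch_size):
        yield json.dumps({"transactions": chunk})


# Made-up sample data: 1,200 transactions of $25.00 each.
sample = [{"transaction_id": i, "amount_cents": 2500} for i in range(1, 1201)]
payloads = list(build_payloads(sample, batch_size=500))
print(len(payloads))  # 3 batches: 500 + 500 + 200
```

Batching like this keeps the number of web service calls manageable when millions of rows have to be moved.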

Once the data is over in the Analytics database, we will clear the caches, and transaction data will magically appear!

The next step is to align source information (email, Facebook Mobile, etc.) with the registration and transaction data. Once we have that working, we will do a similar data export of all the Registration $ and Donation $ so you will be able to see which channels are producing the most donations and which are producing the most registration dollars.
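To make the channel comparison concrete, here is a small sketch of how dollars could be totaled per source once each transaction row carries a source label. The row layout (`source`, `kind`, `amount_cents`) and the sample values are hypothetical, chosen only for illustration.

```python
from collections import defaultdict
from typing import Dict, Iterable


def dollars_by_channel(rows: Iterable[Dict]) -> Dict[str, Dict[str, int]]:
    """Sum registration and donation amounts (in cents) for each source channel."""
    totals = defaultdict(lambda: {"registration": 0, "donation": 0})
    for row in rows:
        totals[row["source"]][row["kind"]] += row["amount_cents"]
    return dict(totals)


# Made-up rows for illustration.
rows = [
    {"source": "email", "kind": "registration", "amount_cents": 5000},
    {"source": "email", "kind": "donation", "amount_cents": 1000},
    {"source": "facebook_mobile", "kind": "registration", "amount_cents": 3000},
    {"source": "email", "kind": "registration", "amount_cents": 2500},
]
totals = dollars_by_channel(rows)
top_donation_channel = max(totals, key=lambda ch: totals[ch]["donation"])
print(totals["email"], top_donation_channel)
```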

And all of this click tracking and matching is happening many times per second. Geeky cool…

1,000 Registrations per Minute

Bayshore Marathon opened this morning and, along with other traffic, put a load of 1,000 registrations per minute on our system.

Response time was 2.6 seconds with peak loads of over 6,000 pages per second.

Registration opened at 9:00 and there were 861 people signed up within 2 minutes – meaning they breezed thru the registration process.

Here are some pretty graphs below. We did not expand the system because we wanted to see what the behavior of the default config is under this load and what areas we should expand when we do our annual infrastructure refresh in February.

Browser Response Time:

Screen Shot 2016-12-01 at 10.02.03 AM.png

3 NGINX Servers (m4.large):

Picture1.png

4 Web Servers (c4.2xlarge) (Note: we will be expanding these in February):

Picture2.png

Primary Database (r3.2xlarge), Backup, Shards (r3.large) (Note: the main database has plenty of capacity, but we need to enlarge or break up the shard in February when we do our annual refresh):

Picture3.png


DDOS – What a Distributed Denial of Service Attack Looks Like

We had a DDOS attack on RunSignUp today. It lasted from about 4:00 PM until 4:45 PM, when we succeeded in cutting it off. It averaged over 1,000 requests per second.

screen-shot-2016-10-03-at-9-32-17-pm

The attack was looking for vulnerabilities like SQL injection. This slowed down the average response time on our website about a half second from about 2.7 seconds to about 3.2 seconds:

Screen Shot 2016-10-03 at 9.30.28 PM.png

We will continue to watch closely for additional attacks and do everything we can to mitigate any delays or issues. We are fortunate that our Amazon AWS infrastructure is so scalable and high performing and alerts us when these types of issues occur. We also know that Amazon is working with us on addressing these types of bad actors on the Internet.
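For the curious, here is a minimal sketch of one way a request spike like this could be spotted from request timestamps. It is only an illustration, not how the AWS alerts actually work. The function name and sample data are made up, and the threshold of 1,000 simply mirrors the request rate mentioned above.

```python
from collections import Counter
from typing import Iterable, List


def seconds_over_threshold(timestamps: Iterable[float], threshold: int) -> List[int]:
    """Return the whole seconds whose request count exceeds `threshold`."""
    per_second = Counter(int(ts) for ts in timestamps)
    return sorted(sec for sec, count in per_second.items() if count > threshold)


# Made-up traffic: a quiet second, a burst of 1,200 requests, then quiet again.
quiet_before = [100.1, 100.5, 100.9]
burst = [101 + i / 1200 for i in range(1200)]
quiet_after = [102.0 + i * 0.1 for i in range(5)]

spikes = seconds_over_threshold(quiet_before + burst + quiet_after, threshold=1000)
print(spikes)  # [101]
```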

RunSignUp Photo Architecture

We have begun developing our new Photo Platform for RunSignUp. There are two key pieces of new architecture that we are using as we build out this new platform, and they are cool for tech geeks.

As mentioned in the 2016 Roadmap, we are developing an open platform for photos for race websites. We anticipate integrations with leading platforms like MarathonFoto, Gameface and new innovators like Pic2Go. The platform will also support connections to photo libraries such as Flickr, Google and Instagram. In addition, we will be providing a mechanism to upload photos to be stored on RunSignUp. We will be providing multiple display options as well as pricing options ranging from free, to pre-buying access, watermarking with sponsor logos, etc. We will also allow for tying photos to results in various ways, which will allow for interesting promotional capabilities.

There are two key areas where we are using new technology in this platform – storage of the MetaData (race, bib, name, time, location of photos, etc.) and the processing of photos, especially if your race is using RunSignUp to store and display the photos rather than a partner like Instagram or GameFace or MarathonFoto.

We have selected Amazon AWS ElasticSearch for our Meta Database. This will allow a very dynamic set of metadata to be stored and searched quickly and in a scalable manner. It will also let us build RunSignUp features on top of it, for example a link in a Results Listing to the images of each runner.
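Here is a small sketch of what a photo metadata document and a "photos for this runner" search might look like. The field names, the sample values, and the URL are hypothetical. The query body uses the standard Elasticsearch `bool` / `filter` / `term` query structure, built here as a plain Python dictionary without calling any client library.

```python
import json
from typing import Dict


def photo_document(race_id: int, bib: int, runner_name: str,
                   finish_time: str, photo_url: str) -> Dict:
    """Build one metadata document for a photo (hypothetical field names)."""
    return {
        "race_id": race_id,
        "bib": bib,
        "runner_name": runner_name,
        "finish_time": finish_time,
        "photo_url": photo_url,
    }


def photos_for_runner_query(race_id: int, bib: int) -> Dict:
    """Build a search body matching every photo of one bib in one race."""
    return {
        "query": {
            "bool": {
                "filter": [
                    {"term": {"race_id": race_id}},
                    {"term": {"bib": bib}},
                ]
            }
        }
    }


doc = photo_document(42, 1234, "Pat Runner", "00:25:31",
                     "https://example.com/photos/1.jpg")
query = photos_for_runner_query(42, 1234)
print(json.dumps(query, indent=2))
```

A results page could run a query like this for each finisher to find the photos to link to.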

The second part of the architecture is Amazon AWS Lambda. This is a new way to deploy applications in a Serverless mode (which is growing in popularity and represents the next generation of computing). We will use Lambda for a variety of functions – for example storing the Metadata when an image is uploaded, resizing photos, watermarking photos, sending photos to Google Vision for automated bib tagging, etc.

There will be a number of Lambda functions we will develop – like create thumbnail image, set watermark, scale image for mobile display, etc.
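To give a feel for one of these functions, here is a minimal sketch of a thumbnail planner. It assumes the function is written in Python and triggered by an S3 upload notification; the `thumbnails/` prefix, the 200 pixel limit, and the returned job format are made up for the example. The actual pixel resizing would need an imaging library and is left out.

```python
from typing import Dict, List, Tuple
from urllib.parse import unquote_plus


def thumbnail_size(width: int, height: int, max_side: int = 200) -> Tuple[int, int]:
    """Scale dimensions to fit within max_side, keeping the aspect ratio."""
    scale = min(max_side / width, max_side / height, 1.0)  # never upscale
    return max(1, round(width * scale)), max(1, round(height * scale))


def handler(event: Dict, context: object) -> List[Dict]:
    """Lambda-style entry point: list the thumbnail jobs for each uploaded photo."""
    jobs = []
    for record in event.get("Records", []):
        bucket = record["s3"]["bucket"]["name"]
        # Object keys arrive URL-encoded in S3 event notifications.
        key = unquote_plus(record["s3"]["object"]["key"])
        jobs.append({
            "bucket": bucket,
            "source_key": key,
            "thumbnail_key": "thumbnails/" + key,  # hypothetical naming scheme
        })
    return jobs


# Made-up event in the shape of an S3 upload notification.
event = {"Records": [{"s3": {"bucket": {"name": "race-photos"},
                             "object": {"key": "race+42/IMG_001.jpg"}}}]}
print(handler(event, None))
print(thumbnail_size(4000, 3000))  # (200, 150)
```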

The RunSignUp Photo Platform is a very large project, but we will be able to create a very scalable, robust system much more quickly by using this new generation of technology. We can’t wait to bring it to you!


Infrastructure Improvement Summary

We try to improve our infrastructure in Q1 of each year to make sure we do not build up technical debt and stay on top of the most current trends in technology. We have four goals when we look at this each year:

  • Improve Availability – reduce the chance our systems will go down, and in the event of real issues, recover in as automated and quick a way as possible.
  • Improve Speed – it is well known that a tenth of a second is meaningful on e-commerce sites. We want to make your race registration as fast as possible to give your participants a great experience and get them thru checkout as quickly as possible.
  • Improve Security – make sure your data and participant data is secure.
  • Improve Scalability – make sure if there is a crush of people signing up for races, that we handle the load without performance loss.

We have made a wide variety of changes over the past 2 months we have been working on this. Probably the most visible and significant result is the 0.5 second improvement in page speed:

January 11-17, 2016:
Screen Shot 2016-03-23 at 2.03.25 PM.png

March 14-20:
Screen Shot 2016-03-23 at 2.03.02 PM.png

Remember, this is a blended rate across Desktop and Mobile (about 52% Mobile Phone access). While a half second difference does not seem like a lot, studies have shown it can make a 7% difference in conversion. Reference 1, Reference 2. And perhaps more importantly, page speed can affect your Google Ranking – so RunSignUp speed is one of the reasons why race websites on RunSignUp rank so highly.

Here is a list of the significant improvements we have made:

  • Aurora Database Upgrade – This is probably the most important upgrade we made since it improved speed, availability, and scalability. The chart on the right shows the decrease in time for some of the database calls that we make – dropping from an average of 40 milliseconds (0.040 seconds) to less than 10 milliseconds.
  • Hardware Upgrades – we upgraded a number of our hardware components to faster, more modern equipment. This was largely responsible for the reduction in the web server response time shown on the right.
  • Optimized Page Load Time – we made a number of changes, like optimizing jQuery library downloads, reducing CSS size, loading Facebook asynchronously, removing the AddThis share and replacing it with better share options, and more. This led to drops in page load time, which is especially important for mobile users on slower connections.
  • Optimized Database Backups – the move to Aurora gives us database backup on a per minute basis. In addition, we added the capability to store permanent snapshots of the database.
  • Improved Availability for Batch Jobs – we created a better mechanism for running our routine batch jobs in the event the main batch job server is down.
  • Failover for Read Replica Database – Providing higher reliability in the event a read replica database becomes unavailable.
  • Software Upgrades and Logging Improvements – we updated to current versions of our core software components and added better long term storage for logs.
  • Upgraded Load Balancer Error Handling – we now detect issues in the load balancers better, and have a smoother failover capability, so users should see no disruption in service if there are issues at this tier.
  • Improved Participant Report Performance – we made changes to the database and queries to optimize one of the most commonly used reports – the Participant Report. On fast connections you can now see sub-second response on reports even when you have over 20,000 participants.
  • Security Improvements – we will not talk publicly about these.
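As an illustration of the read replica failover idea in the list above, here is a minimal application-level sketch: try each replica in turn and fall back to the primary database if none is healthy. This is an assumption about how such a fallback could work, not a description of the actual mechanism, and the host names and the `is_healthy` check are hypothetical.

```python
from typing import Callable, Sequence


def choose_read_endpoint(replicas: Sequence[str], primary: str,
                         is_healthy: Callable[[str], bool]) -> str:
    """Return the first healthy read replica, or the primary if none respond."""
    for endpoint in replicas:
        if is_healthy(endpoint):
            return endpoint
    return primary


# Made-up hosts; pretend the first replica is down.
down = {"replica-1.example.internal"}
endpoint = choose_read_endpoint(
    ["replica-1.example.internal", "replica-2.example.internal"],
    "primary.example.internal",
    lambda host: host not in down,
)
print(endpoint)  # replica-2.example.internal
```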

We see our investments in modern infrastructure as a core value we can provide to races (along with processing money efficiently and providing features to improve your race). Most of these improvements are beyond the capabilities of most races (and even most race registration companies), but they are critical for providing the best platform for you and your participants.

RaceJoy Infrastructure Upgrade

We have also done a major upgrade of the RaceJoy infrastructure on Amazon AWS – more than doubling the size of the database server and upgrading to the latest version of the database. This means our standard environment now handles the largest loads we have had without us having to do special upgrades over a weekend.

Making sure our infrastructure is highly scalable, secure and robust is important as more and more races depend on our technology to make their event special for their participants.

More Performance Improvements

We continue to do our annual big infrastructure update. Improvements to date have decreased average page load time across all devices (over 50% are mobile phones) from about 3.2 seconds to 2.85 seconds. While that does not sound like a lot, it adds up to 980,000 seconds (11 days and 8 hours) of wait time saved last week across the 2.8 Million Page Views on RunSignUp.
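For anyone who wants to check the math, the figures work out as follows (a quick sketch using only the numbers quoted above):

```python
before_s = 3.2          # average page load time before, in seconds
after_s = 2.85          # average page load time after, in seconds
page_views = 2_800_000  # page views last week

# 0.35 seconds per page view, rounded to whole seconds.
saved_s = round((before_s - after_s) * page_views)
days, remainder = divmod(saved_s, 86_400)  # 86,400 seconds per day
hours = remainder // 3_600

print(saved_s, days, hours)  # 980000 11 8
```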

Here are the latest updates:

  • Switched from Google’s Content Delivery Network to CloudFlare’s CDN for jQuery libraries. This reduces the time it takes to load these libraries into browsers.
  • Combined <link> tags for Google Fonts to improve performance.
  • Reordered our HEAD script block to optimize performance.
  • Moved loading of the AddThis share components to after the page has loaded. We will likely replace these in the coming days to further optimize performance.
  • Updated Facebook to load asynchronously.
  • Cleaned up old CSS from our CSS files to decrease load size on each page by 20-25%.
  • Moved some database queries from the main database to a shard.

These, combined with other improvements have dropped our average page load time