Data scraping

In this article, we will shed light on data scraping and Grip’s approach to this subject.

  • What is data scraping?
  • Grip’s position
  • What can I do to minimise scraping of my event?
  • Policies & legal action associated with data scraping

What is data scraping?

Data scraping is a technique where a computer program extracts information from another program's output. Web scraping is a common form of data scraping, where an application gathers valuable data from a website. People often scrape websites to reuse content, create alternative interfaces, or for research purposes. This is a challenge faced by all web applications.

How is it done?

Data scraping can be done manually or automatically, using a variety of methods:

  • Spiders: These programs follow links on websites to gather specific data.
  • Shell Scripts: Basic scripts using Unix tools to download and extract data.
  • HTML Scrapers: Extract data based on HTML patterns.
  • Screen Scrapers: Use real browsers to extract data from web pages.
  • Web Scraping Services: Professional scraping services often employ proxies to overcome restrictions.
  • Embedding and Mobile Apps: Websites can be embedded in other pages or mobile apps.
  • Copy-and-Paste: People manually copy and paste content for various purposes.

Grip’s position

Grip aims to prevent data scraping by sitting behind a login, limiting abusive usage, and constantly changing its environment to deter automated scraping. The event industry poses some unique challenges due to the significant peaks and troughs in usage and the release of attendee lists to registered users, who can include data scrapers.
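
To illustrate what "limiting abusive usage" can look like in general, here is a minimal sketch of a per-user sliding-window rate limiter. This is a generic illustration only and is not a description of how Grip implements its protections; the class name, the limit, and the window length are all assumptions chosen for the example.

```python
import time
from collections import defaultdict, deque


class SlidingWindowLimiter:
    """Hypothetical limiter: allow at most `max_requests` per user per window."""

    def __init__(self, max_requests, window_seconds):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        # user id -> timestamps of that user's recent accepted requests
        self._hits = defaultdict(deque)

    def allow(self, user_id, now=None):
        """Return True if the request is within the limit, False otherwise."""
        now = time.monotonic() if now is None else now
        hits = self._hits[user_id]
        # Discard timestamps that have fallen out of the window.
        while hits and now - hits[0] >= self.window_seconds:
            hits.popleft()
        if len(hits) >= self.max_requests:
            return False  # over the limit: treat as potentially abusive
        hits.append(now)
        return True


# Example values (assumed): at most 3 profile views per 60 seconds.
limiter = SlidingWindowLimiter(max_requests=3, window_seconds=60)
```

A normal attendee browsing profiles rarely reaches such a limit, while an automated script requesting hundreds of profiles in a minute is refused quickly.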

What can I do to minimise scraping of my event?

Not publishing an ‘attendee list’ makes it significantly harder to scrape your event.

Not using the guest and anonymous login features means that only people who have bought a ticket can access Grip, which further deters scraping.

Badge scanning and the potential for data scraping

As explained in our article on Badge Scanning, in order to avoid data scraping or nefarious activity, we strongly advise making sure your registration system supports Scan_IDs of at least 10 randomised alphanumeric characters for the QR codes/barcodes on the badge. We have advanced monitoring in place to catch any nefarious usage associated with the scanning of badges, but it’s vital that a complex schema for Scan_IDs is used.
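
As a rough illustration of the recommendation above, the following sketch generates randomised 10-character alphanumeric Scan_IDs using Python's `secrets` module. It is an example only: the function names and the choice of alphabet (uppercase letters and digits) are assumptions, and your registration system may have its own way of generating these values.

```python
import secrets
import string

# Assumed alphabet: 26 uppercase letters + 10 digits = 36 symbols.
ALPHABET = string.ascii_uppercase + string.digits


def generate_scan_id(length=10):
    """Return one random alphanumeric Scan_ID of the given length."""
    if length < 10:
        raise ValueError("Scan_IDs should be at least 10 characters long")
    # secrets.choice draws from a cryptographically secure random source,
    # so IDs cannot be predicted from previously issued ones.
    return "".join(secrets.choice(ALPHABET) for _ in range(length))


def generate_unique_scan_ids(count, length=10):
    """Return `count` distinct Scan_IDs, for example one per badge."""
    ids = set()
    while len(ids) < count:
        ids.add(generate_scan_id(length))
    return sorted(ids)


# Number of possible 10-character IDs with this alphabet: 36 ** 10.
KEYSPACE = len(ALPHABET) ** 10
```

With this alphabet there are 36^10 possible IDs (roughly 3.6 quadrillion), so guessing a valid badge by trial and error is impractical. Sequential or short numeric IDs, by contrast, can be enumerated easily.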

Policies & legal action associated with data scraping

Regardless of whether it is on Grip or on any other platform, such as your website, we recommend ensuring your Terms & Conditions and/or “Fair Use Policy” clearly state that data scraping is not allowed. It might help to outline the consequences of such activity (e.g. removal from the event, legal action, etc.).

We recommend you work with your general counsel and/or an external law firm on this. Grip might be able to recommend a law firm depending on your geography; please ask your account manager for more information.