Totally new approach to spatiotemporal data analysis

Table of contents

The best opportunities where the use of spatiotemporal data can improve our lives.
This chapter describes in depth how spatiotemporal data is used in our everyday lives. Typical examples of spatiotemporal Big Data use include large-scale safety tracking in public transport (based on India's AIS-140 case), traffic congestion reduction, and usage-based insurance.
Typical issues that need to be solved with a spatiotemporal data analysis solution.
Existing telematics solutions often suffer from a lack of flexibility, expensive scalability, poor precision, and poor responsiveness. Customers are usually looking for an all-in-one solution with an emphasis on real-time data analysis. A brief overview of the issues found in commercially available solutions shows which of them the SpaceTime solution fully covers.
Some of the key differences between existing spatiotemporal data analysis solutions.
Several different technologies can be used to work with big geospatial data. This chapter provides a systematic overview of the existing spatiotemporal data analysis solutions, with detailed arguments for why the SpaceTime solution is more powerful and efficient than its competitors.
What is the Mireo SpaceTime solution, and how does it work?
SpaceTime is a software platform for storing and processing geotagged and timestamped data. It can collect an enormous amount of data and deliver real-time analysis for an unlimited number of vehicles. To give a better understanding of our spatiotemporal data analysis solution, this chapter describes in detail how SpaceTime works.
Glossary: the language of spatiotemporal data.
We use global positioning technology to make road travel easier, safer, and more efficient for organizations and individuals. To make our work easier to follow, this section gives a snapshot of the terms we use most often in our everyday life at Mireo.

"The future belongs to those who see possibilities before they become obvious."

"(John Sculley)"

The best new opportunities where the use of spatiotemporal data can improve our day-to-day lives.

Congestion pricing with the help of GPS

Traffic congestion causes a lot of costs

Traffic congestion is a big issue for cities around the world. People in large cities and urban areas spend about 100 or more hours per year stuck in traffic jams. It's truly annoying, and it reduces citizens' quality of life. But these lost hours are not only a nuisance for individuals; they are also a significant economic problem. Instead of being productive, employees are stuck in traffic, which causes billions in losses. Furthermore, the high fuel consumption and CO2 emissions that come with congestion make environmental pollution worse worldwide. Modern technologies offer the best way to make crucial progress on the traffic congestion problem.

Reducing traffic congestion includes innovation

One of the latest approaches to reducing traffic in large cities is the concept of "congestion pricing." So far, this concept has been implemented in cities like Singapore, Dubai, London, Rome, Milan, and Stockholm, and in a few smaller cities like Valletta and Riga.

Congestion pricing is a toll collection model in which drivers are charged for using vehicles within a congestion charging zone. So far, in the cities mentioned above, the concept is implemented through automatic number-plate recognition. Singapore, with in-vehicle units in every vehicle, has the most sophisticated solution to date. Singapore is the first city to provide congestion prediction an hour in advance, with an overall prediction accuracy of around 90%. But even though Singapore has the most sophisticated system, it's still far from perfect. Infrastructure maintenance is expensive, accuracy is not 100%, and vehicles are charged for entering the congestion charging zone, not for the actual mileage driven in the area.

Lately, companies in the GPS tracking industry have been developing an entirely new approach to reducing traffic congestion. A "congestion pricing" model based on GPS technology supports pricing policies based on time, distance, and place. In such a system, vehicles are charged based on when, where, and how much they drive. Adapting the model to evolving needs is simple, rapid, and cost-effective. It's easy to add new sections of road to the toll scheme, and GPS systems need around 80% less roadside infrastructure, which minimizes both the environmental impact and the installation costs. Furthermore, GPS sensors inside the vehicle provide additional driving information. Authorities could use the collected data, aggregated and anonymized, to improve existing traffic policies, create new ones, and reduce traffic congestion.

Preciseness and reporting are two major challenges

It's pretty straightforward to conclude that GPS technologies can play a significant role in building a detailed picture of traffic conditions. The main objective is to optimize traffic flow, which would help reduce both congestion and accidents.

But, there are two significant challenges related to this concept:

Preciseness: The trajectory reconstruction needs to be extremely accurate. GPS accuracy depends on satellite coverage, and many other factors can affect it as well, such as atmospheric conditions or the quality of the GPS device.
Reporting: The amount of data that needs to be stored is huge. If there are 1,000,000 vehicles, and each vehicle sends about 40 different parameters every 100 meters, the math is clear.
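
To make that math concrete, here is a back-of-the-envelope sketch. The fleet size, the 40 parameters, and the 100-meter reporting distance come from the text above. The annual distance driven per vehicle (15,000 km) and the function name are assumptions chosen only for illustration.

```python
def yearly_volume(vehicles: int, km_per_vehicle_per_year: float,
                  meters_between_reports: float = 100.0,
                  parameters_per_report: int = 40) -> tuple[int, int]:
    """Return (reports, parameter values) produced by the whole fleet in one year."""
    reports_per_vehicle = int(km_per_vehicle_per_year * 1000 / meters_between_reports)
    reports = vehicles * reports_per_vehicle
    return reports, reports * parameters_per_report


# Assumption for illustration only: each vehicle drives 15,000 km per year.
reports, values = yearly_volume(vehicles=1_000_000, km_per_vehicle_per_year=15_000)
print(f"{reports:,} reports and {values:,} parameter values per year")
```

Under these assumptions the fleet produces 150 billion reports, or 6 trillion individual parameter values, every year.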

Given the benefits and challenges mentioned above, modeling traffic congestion predictions with spatiotemporal data collected through GPS devices has become one of the most popular business cases among companies working with GPS technology. A spatiotemporal data analysis solution that supports congestion reduction must analyze data extremely fast. Rapid creation of different real-time reports can give authorities valuable insights. Based on that data, authorities could create new and improved traffic policies, with better regulation of speed limits on various roads. Although implementing such a solution has its challenges, the full impact of spatiotemporal data on traffic congestion reduction is yet to come.

GPS tracking and Usage Based Insurance programs

Computation of car insurance premium is not an easy task

All car owners in the world share a common concern - the amount of their car insurance premiums. All insurance organizations in the world share a common interest - the amount of profit after paying out insurance claims.

Car insurance is obligatory in almost all developed countries. Many factors go into setting car insurance premiums, from vehicle-related to driver-related parameters. Every insurer's goal is to match the amount it charges a car owner to how likely that owner is to claim an insurance payout. Age, gender, place of residence, driving history, car type, and many other parameters are just some of the secret ingredients an insurance company uses to calculate the premium. In addition, some insurers offer drivers the option of installing GPS tracking technology in their vehicles, because there is a growing trend of using GPS tracking data for driver risk assessment. GPS data combined with the usual insurance metrics can yield very accurate risk assessment models and, consequently, better-calibrated insurance premiums.

GPS tracking has a lot of benefits in usage-based insurance

Searching Google for the benefits of GPS tracking in the car insurance industry returns thousands of articles. The main idea of using GPS car tracking in risk assessment is to offer usage-based insurance (UBI). GPS car tracking data provides valuable information about a driver's driving style, which supports the "pay how you drive" car insurance model.

GPS car tracking for insurance can be done in two ways: with data from an accelerometer and with data from GPS devices. Accelerometer data is used to detect an aggressive driving style and crash events. Data from GPS devices is primarily used to identify where a crash happened and for detailed business intelligence. Although a lot of companies collect data from accelerometers, it's important to emphasize that there is no scientific study that confirms a correlation between the driving style recorded by an accelerometer and the probability of an accident.

Despite the enormous benefits of GPS technology in the car insurance industry, there is one reason why it isn't used more often for risk assessment. GPS is accurate to within 5 meters 95% of the time, but the remaining 5% causes problems. Also, GPS positions are not continuous; they are sequences of discrete vehicle positions somewhere on the map. These positions must be map matched, connected, and combined with road characteristics. GPS map matching and path reconstruction is a challenging task. Map matching is the process of inferring the most probable route a vehicle has traveled, based on the received coordinates. Map matching and route reconstruction are widely used, but they would be used even more if they weren't so tricky. More about the map matching process and why it is so difficult can be found in our blog post dedicated to the introduction to the map matching process. Because of this difficulty, GPS providers don't supply insurance companies with real business intelligence, and without business intelligence, there is no accurate risk prediction.
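
To illustrate why map matching is harder than it looks, here is a minimal sketch of the most naive approach: snapping each GPS position to the geometrically nearest road segment. This is not the algorithm used at Mireo. The road names, the coordinates, and the use of a local planar coordinate system in meters are all assumptions made for the example.

```python
import math

Point = tuple[float, float]        # (x, y) in meters, local planar coordinates
Segment = tuple[Point, Point]      # a straight piece of road


def project_onto_segment(p: Point, seg: Segment) -> tuple[Point, float]:
    """Return the point on seg closest to p, and its distance from p."""
    (ax, ay), (bx, by) = seg
    dx, dy = bx - ax, by - ay
    length_sq = dx * dx + dy * dy
    if length_sq == 0.0:
        t = 0.0                    # degenerate segment: a single point
    else:
        t = ((p[0] - ax) * dx + (p[1] - ay) * dy) / length_sq
        t = max(0.0, min(1.0, t))  # clamp the projection to the segment ends
    closest = (ax + t * dx, ay + t * dy)
    return closest, math.dist(p, closest)


def snap_to_roads(p: Point, roads: dict[str, Segment]) -> tuple[str, Point]:
    """Naive matching: pick the road whose segment is nearest to p."""
    best = min(roads, key=lambda name: project_onto_segment(p, roads[name])[1])
    return best, project_onto_segment(p, roads[best])[0]


# Hypothetical road network: a main street and an alley that ends 8 m from it.
roads = {
    "Main Street": ((0.0, 0.0), (100.0, 0.0)),
    "Side Alley": ((50.0, 8.0), (50.0, 100.0)),
}

# A position reported 5 m off Main Street, within normal GPS error.
print(snap_to_roads((48.0, 5.0), roads))
```

In this example, a vehicle driving along Main Street with a 5-meter position error is snapped to Side Alley, because the end of the alley happens to be about 3.6 meters from the reported position. A single position is not enough to decide; the sequence of positions and the road characteristics have to be taken into account, which is what makes the problem tricky.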

The database must be able to store tons of spatiotemporal data

Crucial parameters for insurance risk assessment are travel time (day or night), roadway type (highway, local roadway, intersection), turn type (left, right), and total time spent on the road. All of this data can be collected through the GPS tracking system, but it needs to be stored in a format that includes longitude and latitude (x and y coordinates) for each position. Because the average vehicle sends about 40 different parameters every 100 meters via a GPS tracking system, the amount of data to store is tremendous. Since ordinary databases cannot process such large amounts of data quickly enough, it is necessary to use a spatiotemporal database, one that can query a virtually unlimited number of records in just a few seconds. An extremely fast query engine over the spatiotemporal database and a precise map matching algorithm, with 99.98% accuracy, are also essential for proper data analysis. The fine details of how map matching is implemented at Mireo, and how we treat tracked GPS positions as candidates for the real-life positions, are collected here. With such a system for spatiotemporal data analysis, insurance companies can set a much more accurate insurance premium based on the driver's risk assessment. That could definitely be a breakthrough in their industry.

Safety in public transport - AIS-140 standard in India

Transport security is one of the major problems in India

Women's safety in public transport is one of the critical challenges worldwide. In some developing countries, many women are forced to stay at home because they have no safe transportation option. Authorities across the globe are taking various measures to make public transport secure and pleasant for women.

The government of India decided to take significant measures to protect women and girls. After the Nirbhaya incident, which happened in India in 2012 and has become almost synonymous with sexual violence in public transport, the government formed a justice committee. The main goal of the committee was to propose strict rules for dealing with rape cases in public transport. Despite the committee, the number of rape and sexual harassment cases kept growing. In 2015, around 88% of women in India felt unsafe while using public transport.

Finally, on 28th November 2016, the Ministry of Road Transport and Highways (MoRTH), Government of India, announced and mandated the installation of GPS vehicle location tracking devices and one or more emergency buttons in all public vehicles, starting from 1st April 2018. This announcement gave the entire system enough time to comply with the new regulation. Furthermore, it triggered the development of a robust automotive industry standard, AIS-140.

About the AIS-140 standard

The AIS-140 standard is a set of rules and norms published by the Automotive Research Association of India (ARAI). Under the AIS-140 standard, the government of India has designed an Intelligent Transport System (ITS). AIS-140 is designed to ensure the quality and reliability of GPS devices used in emergencies, and the principal purpose of the ITS is to improve the transportation system in terms of efficiency, quality, comfort, and safety.

Since 1st April 2018, the AIS-140 standard has been implemented across the country for all public transport vehicles. The industry expected demand from approximately 4.4 million public service vehicles, including those of State Road Transport Corporations, domestic and interstate private bus operators, ambulances and emergency response vehicles, educational institutions, car, bus, and taxi fleet owners, rent-a-car services, etc. Complying with the AIS-140 standard means that every commercial vehicle needs a real-time GPS tracking system with emergency request buttons (panic or SOS buttons).

How does the emergency button work?

The AIS-140 compliant GPS tracker transmits the vehicle location data in real time. Using GPS satellites, the network provider, and the SIM card embedded in the GPS tracking device module, GPS devices send spatiotemporal data and other data related to the emergency (panic) button to the national backend data center. The data from the data center is analyzed by the command and control center, which is responsible for informing emergency services in the case of an emergency. Thanks to the spatiotemporal data (latitude, longitude, time) collected through the GPS tracking devices, emergency services can react in time and prevent unwanted situations.

AIS-140 standard brings positive impacts on the industry

In addition to all of the above, this government regulation has created a lot of new job opportunities in India. There are more than ten AIS-140 compliant solution providers in India. Complying with the AIS-140 standard involves various small and medium companies. Some of them manufacture compatible hardware for the emergency (panic) button, and some develop fleet management software that relies on the AIS-140 GPS tracking devices. In any case, the job opportunities are open to both skilled and unskilled workers, so everyone wins.

A good fleet management system has a significant role in the whole process

The amount of data collected from AIS-140 compliant GPS devices is enormous. Moreover, high-speed processing and real-time data analysis are essential for preventing dangerous situations. We can conclude that a fleet management system that provides real-time tracking and reporting for an unlimited number of vehicles is vital for the government and the ITS. Above all, a reliable fleet management system is imperative for this whole, incredibly worthy process of protecting women from sexual harassment in public transport.

"Intellectuals solve problems, geniuses prevent them."

"(Albert Einstein)"

Typical issues that need to be solved with a spatiotemporal data analysis solution

Handling spatiotemporal data is anything but simple and straightforward. All commercially available solutions have some issues; the most common challenges are:

Lack of flexibility

Existing telematics solutions provide customers with a wide variety of reports. However, adding new reports is an expensive and lengthy process. Customers would like a solution where they can generate any report almost in real time. They don't want to be limited to predefined reports, and they don't want to depend on anybody when a change is needed.

Poor responsiveness

Reports should be generated instantly to provide the customers with a great user experience. Existing telematics solutions often have big problems with creating real-time reports. In some cases and industries, real-time reporting is crucial, so poor responsiveness is not an option.

Expensive scalability

Usually, fleet management solutions handle spatiotemporal data from fewer than 20,000 vehicles perfectly well. Adding new vehicles to a GPS tracking system that already has more than 20,000 vehicles is hard, and in most cases, it doesn't work well. Analytical processing would eventually require exponentially more time as data is inserted and the number of records surpasses a few billion. In that case, customers typically use more than one fleet management solution, which at some point becomes hard to handle. Customers are looking for a solution whose scalability ensures simultaneous tracking, real-time analysis, and reporting for an unlimited number of objects. To clarify why scalability is crucially important for all systems that process data produced by moving objects, we wrote a more detailed blog article that can be found here.

Poor precision

GPS satellites broadcast their signals in space with high accuracy. But many things can reduce GPS positioning accuracy, such as atmospheric conditions or satellite signal blockage due to buildings, bridges, etc. GPS is accurate to within 5 meters 95% of the time. The problem is the remaining 5%, because customers need correct information, not just indications. It is up to the fleet management provider to process GPS data and reconstruct trajectories precisely and accurately.

"Strength lies in differences, not in similarities."

"(Stephen Covey)"

Some of the key differences between existing spatiotemporal data analysis solutions

Several different technologies (GeoSpark, GeoMesa, Hive, MongoDB, etc.) can be used to work with big geospatial (spatiotemporal) datasets. Given the number of framework choices available, the processing of massive spatiotemporal data may look like a straightforward task.

When choosing a particular solution for spatiotemporal data analysis and processing, picking the fastest and most feature-rich framework is not always the best idea. Database size and its daily growth are a crucial decision factor. Some query operations, such as determining whether two geospatial objects overlap, are often expensive.

Indexing a huge amount of real-world spatiotemporal data

For example, in the context of AVL tracking applications, the underlying hardware infrastructure should support approximately 100 billion geotagged and timestamped records generated by 100,000 vehicles per year. The major problem with the existing solutions mentioned above is that analytical processing eventually requires exponentially more time once the number of stored records surpasses a few billion. The main reason is that the spatiotemporal indexes used in analytical processing are not suited to indexing a huge amount of real-world spatiotemporal data. Spatiotemporal data streams tend to be highly skewed, that is, very non-uniformly distributed in space-time: some regions of space-time are covered by a high-density set of records, while others are virtually empty. The existing solutions mentioned above use three major spatiotemporal index types: Geohash, the kd-tree family, and the R-tree family.

The Geohash spatiotemporal index is used in GeoMesa and in various products found on Google Cloud. Geohash uses fixed-grid spatiotemporal partition indexes, so index traversal in high-density areas is effectively linear, with no possibility of more granular partitioning. Geohash-based indexes filter records well only if finitely bounded hyper-rectangles are used for range queries. Consequently, with Geohash, a simple query of the form "fetch full history for one single vehicle for the period of one year" will linearly scan the complete database. The fixed-grid partitioning used in Geohash has one more problem: interval data sets have no exploitable order.
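
The fixed-grid behavior can be seen in a minimal sketch of the textbook geohash encoding. This is the generic public algorithm, not code taken from GeoMesa or any other product mentioned here, and the sample positions are made up for the example.

```python
from collections import Counter

BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"  # standard geohash alphabet


def geohash_encode(lat: float, lon: float, precision: int) -> str:
    """Encode a position as a geohash with `precision` characters."""
    lat_range = [-90.0, 90.0]
    lon_range = [-180.0, 180.0]
    chars, bits, bit_count = [], 0, 0
    use_lon = True  # bits alternate between longitude and latitude
    while len(chars) < precision:
        rng, value = (lon_range, lon) if use_lon else (lat_range, lat)
        mid = (rng[0] + rng[1]) / 2
        if value >= mid:
            bits = (bits << 1) | 1
            rng[0] = mid           # keep the upper half of the interval
        else:
            bits <<= 1
            rng[1] = mid           # keep the lower half of the interval
        use_lon = not use_lon
        bit_count += 1
        if bit_count == 5:         # every 5 bits become one base-32 character
            chars.append(BASE32[bits])
            bits, bit_count = 0, 0
    return "".join(chars)


# Three positions a couple of kilometers apart, plus one far away.
positions = [(57.6491, 10.4074), (57.6400, 10.4000), (57.6300, 10.3900), (0.0, 0.0)]
cells = Counter(geohash_encode(lat, lon, precision=5) for lat, lon in positions)
print(cells)
```

At a given precision the cell boundaries are fixed in advance, so every record from a dense area lands in the same cell regardless of how many records there are. The grid does not subdivide itself where the data is dense.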

The kd-tree family of indexes is the most efficient type for indexing n-dimensional interval data. A kd-tree family index is used within the Oracle Spatial solution. There is one major drawback to using kd-tree family indexes with spatiotemporal data. Since they are built on top of static data sets, it is possible to efficiently index even a large interval data set, but only if the underlying record set is immutable afterward.
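
The dependence on a static data set can be sketched with a textbook kd-tree. This is a generic illustration, not the index used by Oracle Spatial or by any other product; the class, the function names, and the sample records are assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class KdNode:
    point: tuple[float, ...]
    axis: int
    left: Optional["KdNode"] = None
    right: Optional["KdNode"] = None


def build_kdtree(points, depth=0):
    """Top-down construction: needs the whole data set to find each median."""
    if not points:
        return None
    axis = depth % len(points[0])
    ordered = sorted(points, key=lambda p: p[axis])
    mid = len(ordered) // 2
    return KdNode(ordered[mid], axis,
                  build_kdtree(ordered[:mid], depth + 1),
                  build_kdtree(ordered[mid + 1:], depth + 1))


def insert(node, point, depth=0):
    """Naive insertion of one new record, with no rebalancing."""
    if node is None:
        return KdNode(point, depth % len(point))
    if point[node.axis] < node.point[node.axis]:
        node.left = insert(node.left, point, depth + 1)
    else:
        node.right = insert(node.right, point, depth + 1)
    return node


def height(node):
    return 0 if node is None else 1 + max(height(node.left), height(node.right))


# Hypothetical (x, y, t) records that arrive in increasing order, as a stream does.
records = [(float(i), float(i), float(i)) for i in range(15)]

static_tree = build_kdtree(records)      # all records known up front
streamed_tree = None
for rec in records:                      # records arrive one at a time
    streamed_tree = insert(streamed_tree, rec)

print(height(static_tree), height(streamed_tree))
```

Built from the complete data set, the 15 records form a balanced tree of height 4. Inserted one by one in arrival order, the same records degenerate into a chain of height 15, which is why such an index works well only while the record set stays unchanged.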

The R-tree family of indexes is used within PostgreSQL/PostGIS, Oracle Spatial, and MemSQL. The major problem with using an R-tree family index on a large scale is that the index over high-density areas effectively becomes linear.

"Big things often have small beginnings"

"(unknown)"

Real knowledge consists of a lot of experience

We've spent the past few years designing a database engine for managing spatiotemporal data. There are several reasons why we decided to develop our own spatiotemporal database instead of using one of the other state-of-the-art solutions. A detailed explanation can be found in the blog post with a fitting headline: Why would a normal person build its own spatiotemporal database? At first, we thought that designing a scalable spatiotemporal database would be a straightforward task. But it turned out to be anything but simple and straightforward. In this overview, we decided to share our experience with you.

"Some people dream of success, while other people get up every morning and make it happen."

"(Wayne Huizenga)"

What is the Mireo SpaceTime solution, and how does it work?

A quick overview of SpaceTime solution

SpaceTime is a software platform

SpaceTime is a software platform for storing and processing spatiotemporal (geotagged and timestamped) data collected from GPS tracking devices. SpaceTime can track data from any kind of moveable object (vehicles, humans, etc.), although here at Mireo, we mostly work with tracking data gathered from vehicles. The system is designed to scale to a size that handles simultaneous tracking and real-time analysis for an unlimited number of vehicles.

SpaceTime Cluster is an AVL system

SpaceTime Cluster is an Automatic Vehicle Location (AVL) tracking system for monitoring, detailed behavioral analysis, and management of moveable objects. It uses a GSM packet radio network to communicate with the tracking devices in moveable objects. SpaceTime Cluster collects and stores data in a SpaceTime database in real time. It also keeps the objects' historical records indefinitely.

SpaceTime database is a different query execution engine

The SpaceTime database is a massively scalable, distributed, disk-based storage and query execution engine. It is a part of SpaceTime Cluster. The database is designed for rapid insertion and querying of spatiotemporal (interval) data types, and it makes analytical tools unprecedentedly fast. The query execution engine is roughly 1,000 times faster than any other state-of-the-art spatiotemporal data analysis solution. The SpaceTime database is a relational SQL database with the architecture of a NewSQL DBMS. Although it is a distributed database, it can be queried using standard SQL without knowing anything about how the data is distributed underneath. In striving to be the fastest, the SpaceTime database takes an utterly different approach to interpreting and compiling SQL queries so that they execute quickly and efficiently. A detailed analysis of compiling SQL queries in Mireo SpaceTime is covered in one of our blog posts. The database is designed as a scale-out architecture with partitioning, sharding, and clustering techniques built into the core of its engine. Alongside this, it maintains ACID database properties (atomicity, consistency, isolation, and durability).

The amount of data collected is massive

The amount of data collected from vehicles is enormous. A moveable object (vehicle) is typically configured to send its location and other sensor data approximately every 3 seconds while it is moving, so the math is undeniable. For real-time tracking of 2 million vehicles while keeping their log history for 5 years, SpaceTime needs to store more than 105 trillion positions with spatiotemporal data. SpaceTime stores this amount of data efficiently, but most importantly, the system also searches it fast. The challenges, drawbacks, and deficiencies of currently available solutions that we encountered during development can be found here.
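
The 105 trillion figure can be reproduced with a short calculation. The sketch below assumes that every vehicle reports around the clock and ignores leap days; both simplifications are assumptions made for the example, so the result is an upper bound.

```python
SECONDS_PER_YEAR = 365 * 24 * 3600  # leap days ignored


def positions_stored(vehicles: int, years: int, report_interval_s: int = 3) -> int:
    """Upper bound: assumes every vehicle is moving, and reporting, all the time."""
    return vehicles * years * SECONDS_PER_YEAR // report_interval_s


total = positions_stored(vehicles=2_000_000, years=5)
print(f"{total:,} positions")
```

With 2 million vehicles, 5 years of history, and one report every 3 seconds, this gives 105,120,000,000,000 positions.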

The major implementation details and fundamentals of the SpaceTime database

The SpaceTime database provides a multidimensional indexing scheme for interval data that adapts to record distribution, support for a virtually unlimited number of records in the database, and the fastest possible (un)bounded hyper-rectangular querying. At the same time, the implementation squeezes every possible bit of performance out of the hardware. Index and record data in the SpaceTime database are partitioned into separate subsets called partitions or shards, which are uniformly distributed among the nodes by using a consistent hashing algorithm. Technically, these subsets are organized as fixed-grid hypercubes, as described in one of our blog posts.
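
As a rough illustration of how consistent hashing spreads shards over nodes, here is a minimal sketch of a generic hash ring. It is not the SpaceTime implementation; the node names, the shard identifiers, the choice of SHA-256, and the number of virtual nodes are all assumptions made for the example.

```python
import bisect
import hashlib


def _hash(key: str) -> int:
    """Map a string to a 64-bit position on the ring."""
    return int.from_bytes(hashlib.sha256(key.encode()).digest()[:8], "big")


class HashRing:
    def __init__(self, nodes, vnodes=100):
        # Each node is placed on the ring many times to even out the load.
        self._ring = sorted((_hash(f"{n}#{i}"), n) for n in nodes for i in range(vnodes))
        self._keys = [h for h, _ in self._ring]

    def node_for(self, shard_id: str) -> str:
        # A shard belongs to the first node clockwise from its hash.
        idx = bisect.bisect(self._keys, _hash(shard_id)) % len(self._ring)
        return self._ring[idx][1]


shards = [f"shard-{i}" for i in range(1000)]
before = HashRing(["node-a", "node-b", "node-c"])
after = HashRing(["node-a", "node-b", "node-c", "node-d"])

moved = [s for s in shards if before.node_for(s) != after.node_for(s)]
print(f"{len(moved)} of {len(shards)} shards moved after adding node-d")
```

The property that matters for scaling out is that adding a node relocates only the shards that the new node takes over. All other shards stay where they were.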

Unique, n-dimensional interval data index

The SpaceTime database is built on a unique, n-dimensional interval data index, which automatically adapts to varying and possibly highly skewed data distribution. There is one single multi-columnar index that indexes all associated columns of records at once. There are no secondary indexes. The physical model intertwines index and record data, which almost completely removes pointer-to-data resolution. The physical locality of the data in the index largely preserves the spatiotemporal locality of records, which in turn maximizes the throughput of disk reads.

The SpaceTime index is similar to the kd-tree family of indexes, with two major improvements. First, the index tree in SpaceTime is built from the bottom up, as opposed to the top-down kd-tree construction. The bottom-up approach essentially allows iterative index creation; in other words, it natively supports mutable datasets. Second, the process of index creation inherently reflects the space-time distribution of the data.

Fixed-grid hypercube partitioning model with a simple set of APIs

The data partition scheme (or sharding scheme) responsible for data distribution among storage and computation nodes is similar to a fixed-grid hypercube partitioning model. However, no ordering based on space-filling curves is used anywhere in the system. Using this partition scheme, the cluster workload distribution logic ensures the maximal possible parallelization of query tasks.

The SpaceTime core storage/indexing engine exposes an extremely simple set of APIs. Besides simple insert/update/delete APIs, there are only APIs for (un)bounded hyper-rectangular record filtering and k-nearest neighbor (kNN) queries. More complex computation, aggregation, and filtering are performed by on-the-fly generated assembly code through which pre-filtered records flow in push form (as opposed to the traditional RDBMS "Volcano" iteration model).
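
The difference between the two execution models, and the meaning of an (un)bounded hyper-rectangle, can be sketched in a few lines. The example is plain Python rather than generated assembly, it shows only the control flow, and every name and record in it is hypothetical rather than part of the SpaceTime API.

```python
from typing import Callable, Iterable, Iterator, Optional

Record = tuple[float, float, float]               # (lon, lat, timestamp)
Bounds = tuple[Optional[float], Optional[float]]  # None means unbounded on that side


def in_box(rec: Record, box: tuple[Bounds, ...]) -> bool:
    """True if the record lies inside the (possibly unbounded) hyper-rectangle."""
    return all((lo is None or lo <= v) and (hi is None or v <= hi)
               for v, (lo, hi) in zip(rec, box))


# Pull ("Volcano") style: each operator asks its child for the next record.
def pull_filter(source: Iterable[Record], box) -> Iterator[Record]:
    for rec in source:
        if in_box(rec, box):
            yield rec


def pull_count(source: Iterator[Record]) -> int:
    return sum(1 for _ in source)


# Push style: the scan drives the loop and hands each record to a consumer.
def push_scan(source: Iterable[Record], box, consume: Callable[[Record], None]) -> None:
    for rec in source:
        if in_box(rec, box):
            consume(rec)


records = [(15.98, 45.81, 100.0), (15.99, 45.80, 200.0), (16.44, 43.51, 300.0)]
box = ((15.9, 16.0), (45.7, 45.9), (None, None))  # bounded in space, unbounded in time

pulled = pull_count(pull_filter(records, box))

pushed = []
push_scan(records, box, pushed.append)

print(pulled, len(pushed))
```

Both variants select the same two records. In the pull model every record crosses an iterator boundary on its way up the operator tree, while in the push model a single loop feeds the records straight into the consuming code.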

LLVM compiler

SQL queries in the SpaceTime database closely resemble the MapReduce computational model. That is, after parsing and AST optimization, queries are translated to assembly functions using the LLVM compiler (learn more here) and spread across the relevant nodes in the system. The results of these micro-tasks are aggregated by the database Smart Client and returned to the user in the same way as in the MapReduce paradigm, leveraging extensively the fact that most aggregate functions can be computed by merging partial results.
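
A minimal sketch of this map-and-merge pattern, for a query that computes a count, an average, and a maximum of vehicle speeds, might look as follows. It is written in plain Python instead of LLVM-generated code, and the node names, the data, and the class are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Partial:
    """Partial aggregate state computed independently on one node."""
    count: int = 0
    total: float = 0.0
    maximum: float = float("-inf")

    def add(self, speed: float) -> None:
        self.count += 1
        self.total += speed
        self.maximum = max(self.maximum, speed)

    def merge(self, other: "Partial") -> "Partial":
        return Partial(self.count + other.count,
                       self.total + other.total,
                       max(self.maximum, other.maximum))


def map_on_node(speeds) -> Partial:
    """Runs on a storage node, over the records that node holds."""
    partial = Partial()
    for speed in speeds:
        partial.add(speed)
    return partial


def reduce_on_client(partials) -> Partial:
    """Runs on the client: merges the per-node results."""
    result = Partial()
    for partial in partials:
        result = result.merge(partial)
    return result


# Hypothetical speed readings (km/h) held by three nodes.
node_data = {"node-a": [50.0, 70.0], "node-b": [90.0], "node-c": [30.0, 60.0, 60.0]}

merged = reduce_on_client(map_on_node(speeds) for speeds in node_data.values())
average = merged.total / merged.count
print(merged.count, average, merged.maximum)
```

Note that the average is not merged directly. Each node returns a count and a sum, and the client divides only at the end, so the result is the same as if all records had been processed in one place.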

Highly available, fault tolerant, and natively horizontally scalable

Like many other distributed systems, the SpaceTime database is highly available, fault tolerant, and natively horizontally scalable. High availability is achieved by massive parallelization of query and insertion tasks, even on a single node. Fault tolerance is implemented by using real-time block device replication (DRBD), with Apache ZooKeeper as the coordination service. Quorum-based synchronization between database nodes is completely avoided through the combination of data partitioning, low-level block device replication, and high-level synchronization through ZooKeeper.

Local SSD storage through the standard POSIX API

The SpaceTime database exclusively uses node-local SSD storage (it supports everything from consumer-grade SSDs to professional, locally attached NVMe storage) through the standard POSIX API. Although a cluster file system would be a logical choice (Ceph, GlusterFS, HDFS, to name just a few open-source examples), it is not possible to use the full disk bandwidth with the level of abstraction that those file systems bring. The main bottleneck with cluster file systems is the network and its latency, which prevents file reads (at least in the commodity hardware world) from reaching maximal disk throughput. By using DRBD low-level replication, the SpaceTime engine ensures that all data is always read directly from local disk storage, completely bypassing the network stack. At the same time, writes to disk are made atomic and distributed by the kernel logic in DRBD, thus providing consistency and durability of data.

Commodity hardware, dedicated hardware or the cloud

The SpaceTime database code (written in C++) strictly follows a shared-nothing multithreading philosophy. Synchronization among threads is reduced to an absolute minimum, and worker threads are pinned to fixed CPU cores. This approach eliminates expensive processor cache synchronization steps and maximizes CPU usage. As a benchmark, SpaceTime fetches from the index and processes roughly 50,000,000 records per second on a single quad-core Intel i5 machine clocked at 3 GHz, with a Samsung 960 Pro NVMe SSD and 16 GB of DDR4 RAM. Worker threads in the SpaceTime engine thus easily saturate the NVMe SSD, routinely reaching about 360,000 IOPS.

The SpaceTime engine is designed and implemented to run on commodity hardware, on dedicated hardware, or in the cloud. When running in the cloud, the only requirement is that SpaceTime must have access to a locally attached, dedicated SSD. SpaceTime can be easily deployed, even as a Docker image.

"Language matters. It shapes thoughts."

"(unknown)"

Glossary: the language of spatiotemporal data

Here at Mireo, we don't speak a language of our own. But just to make things clear, let us help you get familiar with some of the terms we use in our everyday work, and which were mentioned in this overview.

GPS tracking

GPS tracking is a system for monitoring and tracking the location of any moveable object (vehicles, trucks, ships, and even humans and animals). GPS tracking uses GPS satellites to determine the location of GPS devices. GPS devices (GPS trackers) are portable devices designed to broadcast their GPS location. GPS tracking provides you with the location of your moveable object in real time, which is extremely useful to individuals and businesses.

Fleet management

Fleet management is a methodology that uses GPS tracking for supervising any kind of fleet. All relevant information is stored in fleet management software (a GPS tracking solution). Fleet management software can process data and generate reports in real time, which helps companies and fleet owners make further decisions about their business. Fleet management software has limits on fleet size, which is capped at roughly 20,000 moveable objects.

Massive GPS tracking

When we talk about GPS tracking of more than 20,000 moveable objects, we are talking about massive GPS tracking. State-of-the-art fleet management solutions rely on relational databases, which would fall apart under such an amount of spatiotemporal data. That's why massive GPS tracking always goes hand in hand with spatiotemporal databases.

Spatiotemporal data

Spatiotemporal data types are multidimensional. They consist of latitude, longitude, and time, and cannot have fewer than two values. One of the best examples of spatiotemporal data is the data collected from GPS tracking of moveable objects.

Spatiotemporal database

Spatiotemporal databases are designed for the storage of spatiotemporal data types. They need to be scalable, distributed, and disk-based. Since spatiotemporal data types are multidimensional, the query execution engine needs to be very quick. That's why spatiotemporal databases are used for storing data collected from massive GPS tracking.

Relational database

Relational databases store data in tables. Some of the popular relational databases are MySQL, Oracle, Microsoft SQL Server, PostgreSQL, and Sybase. Communication with relational databases is achieved through the standard query language called SQL (Structured Query Language).

NewSQL database

A NewSQL database is a web-scale distributed relational SQL database. It is designed to overcome the limitations of traditional SQL databases in terms of Big Data processing. A NewSQL database has a scale-out architecture with partitioning, sharding, and clustering techniques built into the core of its engine. Some of the popular NewSQL databases are MemSQL, VoltDB, NuoDB, and ClustrixDB.

SpaceTime database

In terms of architecture, the SpaceTime spatiotemporal database is a NewSQL database. SpaceTime database is not a general-purpose SQL database. It implements only a subset of the SQL language and lacks many standard features. The primary purpose of the SpaceTime database is the highly efficient storage and querying of spatiotemporal data types.

SpaceTime Cluster

The SpaceTime Cluster is an Automatic Vehicle Location (AVL) GPS tracking system. It is used for supervision, comprehensive analysis, and management of moveable objects (fleet management). The tracking cluster collects, stores, and analyzes spatiotemporal data from moveable objects in real time. The SpaceTime database is a part of SpaceTime Cluster.

Spatiotemporal data analysis

The data from moveable objects are sent to SpaceTime Cluster at a prescribed frequency. The amount of data collected from vehicles is gigantic since moveable objects typically send location and sensor data approximately every 100 meters. The SpaceTime Cluster provides real-time spatiotemporal data analysis, with instant access to all historical spatiotemporal data.

The definitive guide to an unusual and fundamentally new approach to using spatiotemporal data in our everyday routine