Technology Used

Start with the client

We are in business to help bring data science to our customers. Our customers make many different technology choices in running their business. First and foremost we must consider the customers’ technology to make sure that our proposed solution is a good fit in their business. For example, if a customer has chosen to use mostly open source software, it would be a distraction if we suggested a solution that involved licensing. Customers are happier if we can stay within their comfort zone and not require them to change the way that they work.

The same applies to working practices. Different customers manage their projects in different ways, so it is important to be in harmony not just with the type of technology used but also with the way it is developed and deployed.

Where is the project leading?

A very important consideration when choosing technology is how a project will develop. We are often asked only to build a simple dashboard. However, as projects gain traction, it is common for the amount of data and the complexity of the reporting to increase. With a technology choice we must start with the end in mind, so we consider where the project could be in five years’ time. We may also need to consider radical changes. What happens if the size of the data exceeds what we can do with a database? Consequently, we must plan for an easy upgrade to a data lake.

The Right Tool For The Job

We very often find that people choose the tool with which they are most familiar and then look for inventive ways to force it to do something for which it was never really intended. A common example is an Excel sheet that has been extended to hold far too many rows and is being used for an important business process. At JTA we consider technology choices carefully and we look for tools which are appropriate.

Most projects have many needs and each need will require its own technology. This means that it is not sufficient just to choose the tools; we must also consider how they will interface with one another.

The Three Pillars of Data Science

At JTA we consider and plan our work in three main areas. The three areas are as follows:

Data Platforms: How do we load and store the data?
Data Analysis: How will we run algorithms and calculations efficiently?
Data Reporting: How will we show the results so that they are easy to understand?

Each project plan is built around the three areas because each one demands different technologies and different skills.

Discover how we work with the different technologies

R

This is used for Data Platforms

When we build a data platform we use R to load new data into the system. R allows us to write scripts that process the data from receipt to completion, often joining multiple sources. The language can run error checks and can interface with other systems, such as Master Data Repositories and databases. Data preparation is a very important part of data science and R can help with cleaning, outlier detection, smoothing, weighting and aggregation.
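
As a rough illustration of the preparation steps named above (cleaning, outlier detection and smoothing), here is a minimal sketch. It is written in Python rather than R purely for illustration and is not JTA's actual pipeline. The function names, the sample readings, the modified z-score rule and the window size are all assumptions made for the example.

```python
from statistics import mean, median


def clean(values):
    """Drop missing readings (None) and convert the rest to float."""
    return [float(v) for v in values if v is not None]


def flag_outliers(values, threshold=3.5):
    """Flag points whose modified z-score, based on the median absolute
    deviation (MAD), exceeds the threshold. The threshold is an assumption."""
    med = median(values)
    mad = median(abs(v - med) for v in values)
    if mad == 0:
        # No spread around the median, so nothing can be flagged.
        return [False] * len(values)
    return [abs(0.6745 * (v - med) / mad) > threshold for v in values]


def moving_average(values, window=3):
    """Trailing moving average; early points use the values seen so far."""
    smoothed = []
    for i in range(len(values)):
        chunk = values[max(0, i - window + 1): i + 1]
        smoothed.append(mean(chunk))
    return smoothed


# Hypothetical raw readings with gaps and one obvious spike.
raw = [10, 11, None, 12, 95, 11, 10, None, 12]

cleaned = clean(raw)
flags = flag_outliers(cleaned)
kept = [v for v, is_outlier in zip(cleaned, flags) if not is_outlier]
smoothed = moving_average(kept, window=3)
```

Each step is a small function so that the steps can be reordered or swapped. In a real project the outlier rule and the smoothing window would depend on the data.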

T-SQL

This is used for Data Platforms

Many businesses have legacy systems that store information in a database. We use Transact-SQL to develop queries and jobs that automate the extraction of legacy data for storage in a Data Lake.
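
The extraction pattern can be sketched as follows: run a query against the legacy store, then write the result set to a flat file ready for the lake. This is only a sketch under stated assumptions. It uses Python with an in-memory SQLite database as a stand-in for a real SQL Server instance, so the SQL shown is generic rather than Transact-SQL, and the `orders` table, its columns and the `extract_to_csv` helper are hypothetical.

```python
import csv
import io
import sqlite3


def extract_to_csv(conn, query, out):
    """Run a query and write the result set, with a header row, as CSV.
    Returns the number of data rows written."""
    cursor = conn.execute(query)
    writer = csv.writer(out)
    writer.writerow([col[0] for col in cursor.description])
    rows = cursor.fetchall()
    writer.writerows(rows)
    return len(rows)


# Stand-in legacy database (hypothetical table and data).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER, customer TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [(1, "Acme", 120.5), (2, "Birch", 80.0), (3, "Acme", 42.25)],
)

# In practice this would be a file bound for the data lake;
# an in-memory buffer keeps the example self-contained.
buffer = io.StringIO()
row_count = extract_to_csv(
    conn, "SELECT id, customer, amount FROM orders ORDER BY id", buffer
)
conn.close()
```

Writing the header row from the cursor description keeps the file self-describing, which helps when the extract is later read by a different tool.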

U-SQL

This is used for Data Platforms

U-SQL is a fusion of SQL and C# that can query huge data sets on a data lake.

Python

This is used for Data Analysis

Python is a high-level, general-purpose programming language that has a comprehensive library of functions. Our preference is to use R for most of our work, but Python is better for building applications.

R

This is used for Data Analysis

R is our language of choice for performing data analysis. A large number of code repositories covering a world of data science techniques are available for us to use. In addition, we build our own repositories for custom work. R can also interface with other languages such as C++, which allows us to directly modify R objects in memory. This gives us the ability to custom-build highly efficient, optimized code.

C++

This is used for Data Analysis

We use various dialects of C to build fast processes that we use in R and also for the Internet of Things. An IoT device is usually built around a microcontroller, which is programmed in C with the occasional use of assembly language.

DAX & MDX

This is used for Data Reporting

We often store the results of our analysis in OLAP cubes, which can be built using either the Multi-Dimensional technology or the Tabular technology. These cubes allow us to cache the data to make our reports extremely fast and also allow us to calculate derived metrics from the raw data. Secondary metrics such as growth rates, quartiles, means and medians can be computed by the OLAP engine from the base metric in real time.
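
To make the idea of derived metrics concrete, the sketch below computes the secondary metrics named above from a single base metric. It is written in Python with the standard `statistics` module purely for illustration. An OLAP engine would express these as measures in DAX or MDX instead, and the `monthly_sales` figures and the `growth_rates` helper are hypothetical.

```python
from statistics import mean, median, quantiles


def growth_rates(series):
    """Period-on-period growth as a fraction of the previous value."""
    return [(curr - prev) / prev for prev, curr in zip(series, series[1:])]


# Hypothetical base metric: one value per month.
monthly_sales = [100.0, 110.0, 121.0, 115.0, 130.0]

# Every secondary metric is derived from the same base series.
derived = {
    "mean": mean(monthly_sales),
    "median": median(monthly_sales),
    "quartiles": quantiles(monthly_sales, n=4),  # three cut points
    "growth": growth_rates(monthly_sales),
}
```

Because every secondary metric is a function of the base series alone, only the base metric needs to be stored and the rest can be calculated on demand.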

JavaScript

This is used for Data Reporting

JavaScript is an incredibly flexible language that runs on virtually any platform and allows us to produce custom visualizations. We tend to use TypeScript, a typed superset of JavaScript developed by Microsoft, which allows us to take a more disciplined approach to programming with JavaScript.

SQL Server

This is used for Data Platforms

Although it is gradually being replaced by data lakes and clusters, the venerable SQL relational database still has a very important role to play in data science. We use it particularly to manage Master Data Repositories.

Azure ML

This is used for Data Platforms

This technology simplifies building and training machine learning models that can then be deployed to the cloud.

Data Lakes and Clusters

This is used for Data Platforms

When we are dealing with Big Data we store the information on a Data Lake or Cluster, usually in the cloud. We can then use a variety of compute technologies to query and analyze the data, including the well-established Hadoop.

Databricks

This is used for Data Platforms

Databricks is built on Apache Spark and is offered as a highly integrated service on Microsoft Azure, with strong links to Power BI, Azure Data Lake, Azure Blob Storage and Azure ML. Databricks also works well with R.

Power BI

This is used for Visualization Tools

Power BI is a business analytics service by Microsoft. It provides interactive visualizations and business intelligence capabilities. It also has a very powerful application interface that allows us to use JavaScript and D3 to build custom visualizations for our clients.

D3.js

This is used for Visualization Tools

D3.js is a JavaScript library for producing dynamic, interactive data visualizations in web browsers. It makes use of the widely implemented SVG, HTML5, and CSS standards.

Data Analysis

Whether your business would benefit from statistics, machine learning, artificial intelligence or forecasting and modelling, our data scientists will have the answer. Either performing the analysis for you or supplying the tools that you need, JTA will unlock the insight in your data.

Becoming a data scientist is a journey. One of the best ways to start is to join a reputable data science provider like JTA. Data scientists need to solve problems in a logical and analytical way. Mathematical ability is important, and you will need to understand some algebra, statistics and probability. If you can handle calculus, then so much the better.

The main languages used for data science are Python and R. You will need to learn at least one of them.

Next, you will need to understand how data are stored and manipulated, paying careful attention to big data concepts and techniques.

If you would like to know more, read about how JTA The Data Scientists does its work or have a look at some other FAQs.

You could also explore our case studies or whitepapers.
