There is so much digital data being generated every day. Right now, the figure stands at a whopping 1.145 trillion MB a day. With more and more digitisation happening around the world, that massive figure is only going to get bigger.

So big, in fact, that there are not one but three professional roles set up specifically to monitor and analyse incoming as well as existing data. Within the realm of data science, specialised designations such as data analyst, data scientist and data engineer have been formed to tackle this bulk of digitised information, and no doubt many more will be created as demand grows.

Enough about what the world of big data is up to. Let’s talk about you. Are you looking to carve a niche for yourself in the field of data science? Are you confused as to which role is suitable for you? Can’t decide if you should go for a data scientist course or look at some of the reputed data analytics courses instead? Good thing you’re here then. Keep reading to find out which of the three roles (data analyst, data scientist or data engineer) is the ideal one for you. Once you’re sorted, you can go out and get that coveted data science certification.

We could just straight up give you the details about each role, but no, we’re doing something different instead. We are going to help you truly understand how each job differs by using the Covid-19 vaccination rollout as context. We explain how each of these roles is unique so you can get a clearer, more practical idea of what each job entails. Once you reach the end of this post, you’ll have a better understanding of what you want to do, and which is the best data science course for you.

Below is the scenario to further explain these three roles. 

Scenario:

  • Three different vaccines have been approved for use by the government 
  • The initial stocks have been despatched by the government to hospitals across the country 
  • Both public as well as private hospitals have received the stocks and are preparing to start the vaccination process

Data engineers are the first to get to work. They start by getting in touch with the hospitals, only to quickly realise that different hospitals have different ways of handling things.

  • Each hospital differs on how information, such as vaccine details, patient details or health worker details, will be registered
  • They will have different ways of storing the data, whether in a specific piece of computer software or on a paper form
  • They will also vary in how they report this information back to the government. One hospital may choose a computer file, another may send a paper printout, and a third may choose to communicate via WhatsApp messaging.

It can get quite confusing. This is where the data engineers step in and decide how to organise the data so that it is cohesive and easy for the health ministry to use. So here’s what they do next:

They first get confirmation from the government by getting answers to the following questions. 

1. How will this data be used? Will it be used for reporting, analysis or the archives?

2. How regularly will it need to be updated? Will it happen in real time or on a weekly/monthly basis?

3. Is this sensitive data properly secured from theft or misuse? What security measures are being used?

After much deliberation and careful consideration, the data engineers are ready to start work. They decide to:

1. Standardise data fields so everyone is able to access the same information.

2. Standardise data submission and collection so that the same format is used across the board.

3. Standardise the frequency of submission to ensure that all data collected is within the same timeframe.

4. Store the data in an accessible manner so information can be easily retrieved when needed.

5. Provide access to authorised users while preventing unauthorised people from gaining access.
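To make the first two decisions concrete, here is a minimal Python sketch of how reports arriving in different shapes might be converted into one standard record. Everything in it is an assumption made for illustration: the `VaccinationRecord` fields, the two input formats, the helper names and the sample values are hypothetical, not a description of any real reporting system.

```python
from dataclasses import dataclass
from datetime import date, datetime


@dataclass(frozen=True)
class VaccinationRecord:
    """One standardised record; the field names are illustrative assumptions."""
    hospital_id: str
    vaccine_type: str
    doses_administered: int
    report_date: date


def from_spreadsheet_row(row: dict) -> VaccinationRecord:
    """Hypothetical hospital A sends spreadsheet rows with its own column names."""
    return VaccinationRecord(
        hospital_id=row["Hospital Code"].strip().upper(),
        vaccine_type=row["Vaccine"].strip().lower(),
        doses_administered=int(row["Doses Given"]),
        report_date=datetime.strptime(row["Date"], "%d/%m/%Y").date(),
    )


def from_chat_message(message: str) -> VaccinationRecord:
    """Hypothetical hospital B sends 'id;vaccine;doses;YYYY-MM-DD' as a text message."""
    hospital_id, vaccine, doses, day = (part.strip() for part in message.split(";"))
    return VaccinationRecord(
        hospital_id=hospital_id.upper(),
        vaccine_type=vaccine.lower(),
        doses_administered=int(doses),
        report_date=date.fromisoformat(day),
    )


# Two made-up reports in different formats end up in the same shape.
records = [
    from_spreadsheet_row(
        {"Hospital Code": " h001 ", "Vaccine": "Vaccine A", "Doses Given": "120", "Date": "05/03/2021"}
    ),
    from_chat_message("h002; vaccine b; 85; 2021-03-05"),
]

for record in records:
    print(record)
```

Once every source has its own small converter, everything downstream only ever sees one format, which is the point of standardising fields and submissions.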

There you have it. The data engineers have got the job done. Now the vaccine data is accurate, up-to-date and accessible, which means the data analysts are ready to step in. The health ministry gets them on board to work on the next step. They are given a one-line brief: study the data in detail and analyse what has happened so far.

The data analysts call an urgent meeting to discuss what the health ministry needs. After much discussion, they arrive at the following list of things that can be measured:

1) How many vaccines were issued? The data is sorted by date, type of vaccine, hospital, type of hospital, city and state, and at a country level

2) How many vaccines were used, misplaced, broken or unaccounted for? Again, this information is sorted by date, type of vaccine, hospital, type of hospital, city and state, and at a country level

3) Who administered the vaccines? The doctors or nurses? Again, this is classified by date and state, and at a country level

4) Who received the vaccines? The data is listed by age, gender and type of ID proof, as well as by type of vaccine, city and state, and at a country level
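Summaries like these mostly come down to grouping and counting. The short Python sketch below shows the idea under some loud assumptions: the records are invented, the column layout is hypothetical, and "unaccounted" simply lumps together every dose that was issued but not recorded as used (misplaced, broken or otherwise).

```python
from collections import defaultdict

# Made-up standardised records: (date, vaccine type, state, doses issued, doses used)
records = [
    ("2021-03-05", "vaccine a", "State X", 200, 180),
    ("2021-03-05", "vaccine b", "State X", 100, 95),
    ("2021-03-05", "vaccine a", "State Y", 150, 150),
    ("2021-03-06", "vaccine a", "State X", 200, 190),
]


def summarise(rows, key_index):
    """Total issued, used and unaccounted-for doses, grouped by one column."""
    totals = defaultdict(lambda: {"issued": 0, "used": 0, "unaccounted": 0})
    for row in rows:
        issued, used = row[3], row[4]
        group = totals[row[key_index]]
        group["issued"] += issued
        group["used"] += used
        group["unaccounted"] += issued - used
    return dict(totals)


by_vaccine = summarise(records, key_index=1)  # group by type of vaccine
by_state = summarise(records, key_index=2)    # group by state

for name, summary in (("By vaccine", by_vaccine), ("By state", by_state)):
    print(name)
    for key, totals in summary.items():
        print(f"  {key}: {totals}")
```

Changing `key_index` gives the same summary by date instead. In practice, an analyst would more likely reach for a spreadsheet, SQL or a data analysis library, but the logic is the same.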

The data analysts have sorted this important data efficiently and are now very happy with themselves. They have created summaries of all these points and are ready to share them with the Health Minister. The Health Minister is suitably impressed. He approves of how they have summarised what has happened so far. Now, he needs more. Can they keep him informed of what may happen with the vaccination drive in the future?

“No, but we can,” a few mysterious voices say in unison. Enter the data scientists, the radical team on the data force that is going to look inside a crystal ball and figure out the future. In the absence of a mystical glass sphere, they’ll just look at the data analysis instead.

After carefully examining the existing data, they use advanced statistical techniques to provide the following forecasts:

1) The maximum number of daily vaccines that can be administered, as per the number of registered doctors and nurses across the country

2) The time it would take to vaccinate the entire country, based on both current and maximum daily vaccines, calculated at state and national level

3) The order quantity and frequency of vaccines by type, based on production capacity, the prescribed gap between the first and second vaccination, and other factors

4) The likely distribution of vaccines across the country, as people start to return to cities in search of work again. This will be based on rail and air travel data taken from other sources.
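The first two forecasts can be illustrated with some back-of-the-envelope arithmetic. The Python sketch below is deliberately naive: it assumes a constant daily rate, two doses per person and entirely made-up figures, and the function names are hypothetical. Real forecasts would rely on proper statistical models rather than a straight division.

```python
import math


def max_daily_doses(vaccinators: int, doses_per_vaccinator_per_day: int) -> int:
    """Upper bound on daily doses if every registered vaccinator works at full capacity."""
    return vaccinators * doses_per_vaccinator_per_day


def days_to_finish(population: int, doses_per_person: int, doses_given: int, daily_rate: int) -> int:
    """Days needed to deliver the remaining doses at a constant daily rate."""
    remaining = population * doses_per_person - doses_given
    return math.ceil(remaining / daily_rate)


# All figures below are made up for illustration.
population = 1_000_000
doses_given = 200_000
current_rate = 10_000
ceiling = max_daily_doses(vaccinators=500, doses_per_vaccinator_per_day=50)

days_at_current_rate = days_to_finish(population, 2, doses_given, current_rate)
days_at_max_rate = days_to_finish(population, 2, doses_given, ceiling)

print(f"At the current rate: {days_at_current_rate} days")
print(f"At the maximum rate: {days_at_max_rate} days")
```

With these invented numbers, the drive would take 180 days at the current pace and 72 days at full capacity. Running the same calculation per state gives the state-level view.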

The data scientists have successfully filed their predictions, enabling the health ministry to make informed decisions.  

In conclusion

There you have it: the difference between data engineers, data analysts and data scientists. While these three roles may appear somewhat similar, each has unique characteristics that set it apart from the others. To know more about becoming a data scientist, check out data scientist courses that will equip you with the tools to perform data-based predictions. There are also data analytics courses that do the same for aspiring data analysts. Once you receive your data science certification, you are ready to put your skills to the test and tackle the world of big data.

To find out more about the best data science courses, click here

Unsure about that data science certification? Take our 6-step aptitude test here to know more
