What is a data scientist?
Q1 – Definition? Many answers from the five members of the panel.
A data scientist deals with data. They are excited by data and need to link it to the problem to be solved: what type of data is needed to answer the question? They then make sense of the answers in context.
It is a very broad field, ranging from expert statisticians to self-taught statisticians who find connections and analyse data to find answers.
It can be a team of people who, together, solve interesting problems to get at the truth in the data. The scientist aspect involves generating and proving hypotheses.
A computer scientist who works with social scientists, connecting data so that the social scientists can develop insights using their domain knowledge.
Was a statistician, then became a data miner (computation). Now a data scientist needs to be an artist who can work out what the real question is, which is not a scientific question.
Q2 – Is it a new role, or is it hype?
It is a new domain: working with social scientists to make their lives easier, using analytical techniques, algorithms and visualisations.
The role already existed but is now more visible. It is cool!
It draws together lots of roles from the past; the title may be new but the activities are pre-existing. Making sense.
The person who asks the right question that technology can solve using brute force.
Q3 – Core skills and required curriculum
Data engineering, classification, clustering, networks (of linked data), graphical networks leading to ideas. Big Data analysis and scalable systems.
Must be creative problem solvers with programming capability.
Need basic maths and statistics; need to see the problem clearly and be capable of contextualisation.
A significant number of industry experts are needed on the courses. They must learn to read the software manual.
Must build the trusted tools for others to use. Need curiosity. Must know the data that will answer the questions and the business problem. Need experience.
Q4 – What are the hottest new innovative skills for a data scientist?
Optimising the physical world based on instrumented sensor data.
Taking structured data and correlating it with unstructured data.
Making sense of the analysed data in a specific context. Communication skills. Collecting and connecting data.
Q4 – Is domain knowledge needed, or is it a failing?
The data scientist knows the questions and works with the domain experts. Must be able to listen carefully.
See Kaggle competitions; needs a spark of creativity.
Needs teamwork skills to work with the rest of the Big Data Analytics team.
Q5 – Skills in the curriculum
Ethics, communication skills.