Data Engineering versus Data Science, DataOps, and Overcoming Back Office Barriers to Your Next Big Data Project
When you’re ready to expand into Big Data projects, whether that’s streaming, IoT, text or image analytics, or simply creating a data lake, who are you going to ask to set that up? Increasingly, this is someone called a ‘Data Engineer’.
As recently as a year or so ago, the term ‘data scientist’ applied both to someone doing predictive analytics and to the person you would turn to for implementing Spark or a data lake. Thankfully, in a fairly short time we have come to differentiate Data Scientists from Data Engineers and to acknowledge the engineers’ special skill set, which blends traditional CS skills with the newer disciplines needed to store, extract, and prepare data for data scientists. It’s an emerging skill set and discipline increasingly known as DataOps.
Data Engineers are almost as rare as Data Scientists and it’s one more role we’re asking our IT shops to undertake, with some reluctance on their part. The good news is that one Data Engineer may be able to support several Data Scientists. The bad news is that they’re still tough to find.
That’s led Qubole and our speakers Minesh Patel and Spencer Huang to start providing Data Engineering as a service, so that you can quickly and cost-effectively stand up that data lake, streaming app, or any other support the data science team needs, with top talent in the field on call for support. Minesh and Spencer will cover:
• Data Engineering vs Data Science
• Exploratory stages: what roles data engineers play and what roles data scientists play
• Why data engineers are necessary
• What DataOps is
• Use cases showing how Qubole can help data engineering and data science efforts, plus a demo of some of Qubole’s more helpful features
About our Speakers
Minesh Patel is a Technical Director & Solutions Architect with Qubole. He is one of the first two solution architects at Qubole and has deep experience helping customers tackle complex, real-world problems in Big Data. Minesh has a broad range of expertise including Big Data, Cloud, In-Memory Solutions, Performance and Scalability, Integration, BPM, Implementation, and Troubleshooting. He holds a B.S. in Computer Engineering from UCSD and a Master’s in Software Engineering from SJSU.
Spencer Huang, in his role as Director of Strategic Accounts, helps companies maximize the value of their data assets by uncovering insights that drive new revenue or lower operational costs. Qubole uses current technologies optimized for the cloud, including Spark, Presto, Hadoop, and Hive. Spencer holds a B.S. in Electrical and Computer Engineering from Rutgers and an MBA from USC.
Qubole is a big data-as-a-service company that provides a fast, easy, and reliable path to turn big data into valuable business insights. Qubole’s cloud-based platform addresses the challenges of processing huge volumes of structured and unstructured data. It uses clouds such as Amazon Web Services, Google Compute Engine, Microsoft Azure, and Oracle Cloud Platform to help enterprises extract value from their big data while enabling their operations teams to be nimble and adaptive to their users’ needs. Qubole achieves this through features such as auto-scaled big data clusters and integrated toolsets for data analysts, developers, and business users. With more than 500 PB of data processed every month across its customer base, Qubole’s platform makes enterprises agile with big data.