data scientists

See the following -

Big Data Systems Are Making A Difference In The Fight Against Cancer

Ben Lorica | Forbes | January 17, 2014

As open source, big data tools enter the early stages of maturation, data engineers and data scientists will have many opportunities to use them to “work on stuff that matters”. Along those lines, computational biology and medicine are areas where skilled data professionals are already beginning to make an impact. [...]

Continuum Analytics Teams Up with Intel for Python Distribution Powered by Anaconda

Press Release | Continuum Analytics | September 8, 2016

Continuum Analytics, the creator and driving force behind Anaconda, the leading Open Data Science platform powered by Python, is pleased to announce a technical collaboration with Intel resulting in the Intel® Distribution for Python powered by Anaconda. Intel Distribution for Python powered by Anaconda was recently announced by Intel and will be delivered as part of Intel® Parallel Studio XE 2017 software development suite. With a common distribution for the Open Data Science community that increases Python and R performance up to 100X, Intel has empowered enterprises to build a new generation of intelligent applications that drive immediate business value...

Data Scientists Need Their Own GitHub. Here are Four of the Best Options

Jordan Novet | Venture Beat | March 1, 2016

Imagine if a company’s three highly valued data scientists can happily work together without duplicating each other’s efforts and can easily call up the ingredients and results of each other’s previous work. That day has come. As the data scientist arms race continues, data scientists might want to join forces. Crazy idea, right?...

DocGraph Launches Linea

Press Release | DocGraph | June 1, 2015

DocGraph is launching a new web-based portal, Linea (http://www.docgraph.org/linea), to enable the health data science community to discover, aggregate and enrich new open healthcare datasets. DocGraph Linea is based on technology developed and contributed by Merck (known as MSD outside the United States and Canada). DocGraph Linea will provide data scientists with a socially-enabled community open data platform that collects details about disparate healthcare datasets, and further allows the community to extend what data is available. Users will be able to search datasets, understand data lineage, view relationship matrices, add metadata, and see community algorithms.

IBM Announces Major Commitment to Advance Apache®Spark™, Calling it Potentially the Most Significant Open Source Project of the Next Decade

Press Release | IBM | June 15, 2015

IBM today announced a major commitment to Apache®Spark™, potentially the most important new open source project in a decade that is being defined by data. At the core of this commitment, IBM plans to embed Spark into its industry-leading Analytics and Commerce platforms, and to offer Spark as a service on IBM Cloud. IBM will also put more than 3,500 IBM researchers and developers to work on Spark-related projects at more than a dozen labs worldwide; donate its breakthrough IBM SystemML machine learning technology to the Spark open source ecosystem; and educate more than one million data scientists and data engineers on Spark.

Machine Learning in Healthcare: Part 3 - Time for a Hands-On Test

Every inpatient and outpatient EHR could theoretically be integrated with a machine learning platform to generate predictions that alert clinicians to important events such as sepsis and pulmonary emboli. This approach may become essential once genetic information is also included in the EHR, which would demand more advanced computation. However, using machine learning and artificial intelligence (AI) in every EHR will be a significant undertaking: subject matter experts and data scientists must not only create and validate the models but also re-test them over time and in a variety of patient populations. Models could change over time and might not work well in every healthcare system. Moreover, predictive performance must be clinically, and not just statistically, significant; otherwise, the predictions will become another source of “alert fatigue.”

Why Data Scientists Love Kubernetes

Let's start with an uncontroversial point: Software developers and system operators love Kubernetes as a way to deploy and manage applications in Linux containers. Linux containers provide the foundation for reproducible builds and deployments, but Kubernetes and its ecosystem provide essential features that make containers great for running real applications... What you may not know is that Kubernetes also provides an unbeatable combination of features for working data scientists. The same features that streamline the software development workflow also support a data science workflow! To see why, let's first see what a data scientist's job looks like...
