Apache Spark

See the following:

10 Ways Big Data And Data Science Impacted The World In 2020

Lauren Maffeo | Opensource.com | January 19, 2021

Big data’s one of many domains where open source shines. From open source alternatives for Google Analytics to new features in MySQL, 2020 brought several ways for open source enthusiasts to learn big data skills. Get up to speed on how open source data science languages, libraries, and tools help us understand our world better by reviewing the top 10 data science articles published on Opensource.com last year.

3 Emerging Open Source Data Analytics Tools Beyond Apache Spark

On the data analytics front, profound change is in the air, and open source tools are leading many of the changes. Sure, you are probably familiar with some of the open source stars in this space, such as Hadoop and Apache Spark, but there is now a strong need for new tools that can holistically round out the data analytics ecosystem. Notably, many of these tools are customized to process streaming data...Streaming data analytics are needed for improved drug discovery...While Apache Spark grabs many of the headlines in the data analytics space, given billions of development dollars thrown at it by IBM and other companies, several unsung open source projects are also on the rise. Here are three emerging data analytics tools worth exploring:

Broad Institute to Release Genome Analysis Toolkit 4 (GATK4) as Open Source Resource to Accelerate Research

Press Release | Broad Institute of MIT and Harvard | May 24, 2017

The Broad Institute of MIT and Harvard will release version 4 of the industry-leading Genome Analysis Toolkit under an open source software license. The software package, designated GATK4, contains new tools and rebuilt architecture. It is available currently as an alpha preview on the Broad Institute's GATK website, with a beta release expected in mid-June. Broad engineers announced the upgrade, as well as the decision to release the tool as an open source product, at Bio-IT World today...

Data Science Jobs Report 2019: Python Way Up, Tensorflow Growing Rapidly, R Use Double SAS

In my ongoing quest to track The Popularity of Data Science Software, I've just updated my analysis of the job market. To save you from reading the entire tome, I'm reproducing that section here. One of the best ways to measure the popularity or market share of software for data science is to count the number of job advertisements that highlight knowledge of each as a requirement. Job ads are rich in information and are backed by money, so they are perhaps the best measure of how popular each software is now. Plots of change in job demand give us a good idea of what is likely to become more popular in the future.

DataStax And Databricks Stack Bricks Of Data

Adrian Bridgwater | Open Source Insider | May 8, 2014

Apache Cassandra company DataStax is snuggling up with Databricks. The partnership is designed to deliver open source code back to the Apache Spark and Apache Cassandra communities to ensure that developers always have the best tools...

Hot Programming Trends from 2016

Technology is constantly moving forward—well, maybe not always forward, but always moving. Even for someone who keeps an eye on the trends and their effect on programmers, discerning exactly where things are headed can be a challenge. My clearest glimpse into open source programming trends always comes in the fall when I work with my fellow chairs, Kelsey Hightower and Scott Hanselman, and our fantastic programming committee to sculpt the coming year's OSCON (O'Reilly Open Source Convention). The proposals that we get and the number focused on specific topics turn out to be good indicators of hot trends in the open source world. What follows is an overview of the top programming trends we saw in 2016...

How Open Source Is Changing the Pace of Software Development

...as new computing architectures and approaches rapidly evolve for cloud computing, for big data, for the Internet of Things (IoT), it's also becoming evident that the open source development model is extremely powerful because of the manner in which it allows innovations from multiple sources to be recombined and remixed in powerful ways. Consider the following examples...

New Platfora Release With Open Data Access And Flexible Workflow Options Makes Big Data Analytics Available For All

Press Release | Platfora | May 14, 2014

Platfora today announced an update to its full-stack analytics platform with significant feature enhancements to tightly couple Big Data Analytics into core production workflows within the enterprise. 

Open Source Among Top 10 Insurance Technology Trends in Health IT for 2016

Press Release | X by 2 | February 2, 2016

Healthcare technology is shaking things up faster than ever before. Whether it’s the quicker pace or technology-resistant providers, it’s crucial for leaders to stay educated and up-to-speed on the industry’s top developments. Here are 10 insurance technology trends that should be top of mind for 2016...Open-source will continue to make inroads: Microsoft's recent acceptance of open-source technologies such as Hadoop, Spark and D3.js in its DBMS and BI offerings is a clear indication that vendors are having a hard time keeping closed-source software competitive.

Open Source Governance and the Rise of a New Open Health Movement

It's hard to tell if (or when) new open source foundations will appear and claim a leading role in healthcare. It would be interesting to see one created to scale an existing viable model, such as the one from Oroville Hospital using VistA. Or we could see OSEHRA shifting its focus and expanding its charter beyond just the US government space. Nevertheless, the successful foundation would keep a low barrier to entry for innovators, allowing them to incorporate and scale open source healthcare technologies into commercial products. Time will tell, but what's for certain is that we live in interesting times, and I am looking forward to massive innovation in healthcare in the near future. The time is ripe.

Open Source Projects Are Transforming Machine Learning and AI

Machine learning and artificial intelligence have quickly gained traction with the public through applications such as Apple’s Siri and Microsoft’s Cortana. The true promise of these disciplines, though, extends far beyond simple speech recognition performed on our smartphones. New, open source tools are arriving that can run on affordable hardware and allow individuals and small organizations to perform prodigious data crunching and predictive tasks.

OpenShift Commons Gathering Event Preview

We're just two months out from the OpenShift Commons Gathering coming up on November 7, 2016 in Seattle, Washington, co-located with KubeCon and CloudNativeCon. OpenShift Origin is a distribution of Kubernetes optimized for continuous application development and multi-tenant deployment. Origin adds developer and operations-centric tools on top of Kubernetes to enable rapid application development, easy deployment and scaling, and long-term lifecycle maintenance for small and large teams. And we're excited to say, the 1.3 GA release of OpenShift Origin, which includes Kubernetes 1.3, is out the door! Hear more about the release from Lead Architect for OpenShift Origin, Clayton Coleman...

Why Data Scientists Love Kubernetes

Let's start with an uncontroversial point: Software developers and system operators love Kubernetes as a way to deploy and manage applications in Linux containers. Linux containers provide the foundation for reproducible builds and deployments, but Kubernetes and its ecosystem provide essential features that make containers great for running real applications...What you may not know is that Kubernetes also provides an unbeatable combination of features for working data scientists. The same features that streamline the software development workflow also support a data science workflow! To see why, let's first see what a data scientist's job looks like...

WSO2 Founder and CEO to Unveil Latest Product Developments For Harnessing Today’s Connected World At WSO2Con US 2014

Press Release | WSO2 | October 28, 2014

In an increasingly connected world, enterprises are extending new business models, processes and services across their employees, customers and partners. In his keynote presentation at WSO2Con US 2014 today, Dr. Sanjiva Weerawarana, WSO2 founder and CEO, will discuss how competing in this connected world requires a new holistic approach to IT architecture that harnesses the combined power of the cloud, APIs, mobile computing, Internet of Things (IoT), analytics, DevOps, and integration...
