50 years of Data Science article by David Donoho, Sept. 18, 2015


The question IBM set for itself was whether it is possible to build a computer system that could process big data and come up with sensible answers in seconds, so well that it could compete with human opponents.  What is Watson, really?

IBM Watson Jeopardy Full Episode (Day 1) and more YouTube videos

Computer says “try this” by The Economist

 IBM Watson Analytics

More at RMLA Watson

"Big data is a term applied to data sets whose size or type is beyond the ability of traditional relational databases to capture, manage, and process the data with low-latency. And it has one or more of the following characteristics – high volume, high velocity, or high variety. Big data comes from sensors, devices, video/audio, networks, log files, transactional applications, web, and social media - much of it generated in real time and in a very large scale." From IBM

 "Big Data research success is often contingent on access to the newest, most advanced, and often expensive hardware systems and the expertise needed to build and implement such systems."
"This article identifies several notable and popular Big Data technologies typically implemented using large and extremely powerful cloud-based systems and investigates the feasibility and utility of development of Big Data analytics systems implemented using low-cost commodity hardware in basic and easily maintainable configurations for use within academic social research."

Big Euclid at the Library of Alexandria  

                              A Library of Classical Statistics for big data

Euclid Library of Alexandria

Research Methods Library of Alexandria (RMLA)     Mirror of Research Methods Library of Alexandria     Dr. Serageldin Course     Research Methods Supercourse (home)

During the past few years there has been a revolution in computing. It has become easier and less expensive to collect ever larger data sets. What has sprung up is a field called “Big Data”. Big data consists of the storage, retrieval and analytics of data sets that were not imaginable a decade ago. One aspect that has received little attention is the application of classical statistics to “big data”. In this web site, which we call “Big Euclid”, we provide background materials for the statistical analysis of big data. It links closely with the Euclid Research Methods Library of Alexandria, which is the largest and most comprehensive research methods library on the web.
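As one illustration of what applying classical statistics to big data can involve, the sketch below computes a sample mean and variance in a single pass, so the data never has to fit in memory. It uses Welford's online algorithm, which is our choice for the example and is not taken from the Big Euclid materials; the names `RunningStats` and `summarize` are hypothetical.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class RunningStats:
    """One-pass accumulator for count, mean, and squared deviations.

    Illustrative only: uses Welford's online algorithm, an assumption of
    this sketch rather than a method described on the Big Euclid site.
    """
    n: int = 0
    mean: float = 0.0
    m2: float = 0.0  # running sum of squared deviations from the mean

    def update(self, x: float) -> None:
        # Fold one observation into the running summary.
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self.m2 += delta * (x - self.mean)

    def variance(self) -> float:
        # Classical sample variance with the n - 1 denominator.
        if self.n < 2:
            raise ValueError("variance needs at least two observations")
        return self.m2 / (self.n - 1)


def summarize(values: Iterable[float]) -> RunningStats:
    """Consume any iterable (list, generator, file reader) exactly once."""
    stats = RunningStats()
    for x in values:
        stats.update(x)
    return stats
```

Because `summarize` accepts any iterable, it can be fed a generator that reads a large file record by record, for example `summarize(float(line) for line in open(path))` where `path` is a placeholder for a file with one number per line.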

BIG EUCLID, a site for statistical analysis for big data

We have just begun to discuss the establishment of Big Euclid as part of the Euclid library. This would be a library of big data methods that anyone can use for free. We are collecting lectures and materials related to statistical approaches to big data.


1. Library of Alexandria Science Supercourse lectures on big data - http://ssc.bibalex.org/search/list.jsf?term=Big+Data     such as     Big Data And Analytics Challenges and Issues lecture, Data Mining and Big Data lecture and more

2. Statistics "big data" lectures on the web    -    search for statistics "big data" site:.edu filetype:ppt (224 lectures on Sept 20, 2016)   such as  Data Science  by Jon Kettenring,     Introduction to Big Data lecture and more

3. Statistics "big data" videos on the web -  such as Classical Statistics in a Big Data World by Dan McClary, TOP 10 BIG DATA VIDEOS ON YOUTUBE, Noam Chomsky on "Big Data" (2015) videos and more

What is already built?

1. Million scientists network and big data grid. Access to the major leaders of classical statistics.

2. Euclid Research Methods Library of Alexandria - the largest research methods site, with the largest collection of statistical learning materials.

3. Apps for Academics: Apps for Mobile Phones in Big Data, Searches in Big Data by African Country, Searches in Big Data Analytics by African Country

4. Discussion of Article: Clash of Paradigms:  Statistics and Big Data

Next steps

1. The first step would be to inform statistical experts and form a grid of experts in big data.

2. The second would be to collect .ppt lectures and videos on statistics and big data.