Goods: Organizing Google’s Datasets '16
TLDR: Internal search engine operating over 26*10^9 *datasets* (e.g. DB tables), 5% daily churn, schema inference... interesting stuff
https://static.googleusercontent.com/media/research.google.com/en//pubs/archive/45390.pdf
This common and unfortunate fact of the lack of an adequate presentation of basic ideas and motivations of almost any mathematical theory is, probably, due to the binary nature of mathematical perception: either you have no inkling of an idea or, once you have understood it, this very idea appears so embarrassingly obvious that you feel reluctant to say it aloud; moreover, once your mind switches from the state of darkness to the light, all memory of the dark state is erased and it becomes impossible to conceive the existence of another mind for which the idea appears nonobvious.
— Gromov
Source: M. Berger, Encounter with a geometer, Part II, Notices Amer. Math. Soc. 47 (2000), no. 3, 326--340.
via @atemerev
http://www.ams.org/notices/200003/fea-berger.pdf
Google support at its best
(TLDR: Firebase costs increased 70x because they started including SSL handshake in traffic calculation)
https://news.ycombinator.com/item?id=14356409
On the Turing Completeness of MS PowerPoint 😂
https://www.andrew.cmu.edu/user/twildenh/PowerPointTM/Paper.pdf
Neurosurgeon: Collaborative Intelligence Between the Cloud and Mobile Edge
TLDR: It's best to split processing between device and cloud, here's how to do it automatically (for DNNs)
http://web.eecs.umich.edu/~jahausw/publications/kang2017neurosurgeon.pdf
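A rough sketch of the split-point idea, not the paper's actual implementation: given per-layer latencies on the device and in the cloud plus the size of each layer's output, try every cut and keep the cheapest one. The function name, its parameters, and all numbers in the usage note are made up for illustration; the real system predicts these latencies rather than taking them as inputs, and it can also optimize for energy.

```python
def best_split(mobile_ms, cloud_ms, out_bytes, input_bytes, bytes_per_ms):
    """Pick k so that layers [0, k) run on the device and layers [k, n) in the cloud.

    mobile_ms / cloud_ms: per-layer latency estimates (hypothetical inputs).
    out_bytes[i]: size of layer i's output; input_bytes: size of the raw input.
    Returns (k, total_latency_ms). Sending the final result back is ignored.
    """
    n = len(mobile_ms)
    best_k, best_latency = 0, float("inf")
    for k in range(n + 1):
        if k == n:
            transfer = 0.0  # everything runs on the device, nothing is uploaded
        else:
            # Upload the raw input (k == 0) or the output of the last device layer.
            payload = input_bytes if k == 0 else out_bytes[k - 1]
            transfer = payload / bytes_per_ms
        latency = sum(mobile_ms[:k]) + transfer + sum(cloud_ms[k:])
        if latency < best_latency:
            best_k, best_latency = k, latency
    return best_k, best_latency
```

With toy numbers like `best_split([10, 10, 50], [1, 1, 5], [100_000, 1_000, 10], 50_000, 100)` the cheapest cut lands after the second layer: early layers shrink the data enough that uploading becomes cheap, while the heavy last layer is better off in the cloud.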
DeepXplore: Automated Whitebox Testing of Deep Learning Systems '17
TLDR: Neuron coverage, use 3+ cross-referencing oracles to detect outliers
https://arxiv.org/pdf/1705.06640.pdf
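To make the two ideas in the TLDR concrete, here is a minimal sketch, assuming activations have already been collected as plain lists (one row per test input, one value per neuron). Function names and the threshold are my own; the paper's tooling also scales activations before thresholding, which is skipped here.

```python
def neuron_coverage(activations, threshold=0.0):
    """Fraction of neurons that exceed `threshold` on at least one test input."""
    if not activations:
        return 0.0
    n_neurons = len(activations[0])
    covered = [False] * n_neurons
    for row in activations:
        for i, value in enumerate(row):
            if value > threshold:
                covered[i] = True
    return sum(covered) / n_neurons


def disagreeing_inputs(predictions_per_model):
    """Indices of inputs on which the models do not all predict the same label.

    predictions_per_model: one list of labels per model, aligned by input index.
    Similar models act as each other's oracle, so no manual labels are needed.
    """
    return [
        i
        for i, labels in enumerate(zip(*predictions_per_model))
        if len(set(labels)) > 1
    ]
```

Inputs flagged by `disagreeing_inputs` are the interesting ones: at least one of the models is probably wrong there.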
> Design your construct, pick your deliverable - dsDNA, plasmid DNA or purified protein - and Serotiny will instantly price your order from various vendors.
https://serotiny.bio/notes/pinecone/
We live in interesting times :)
> Big data systems may scale well, but this can often be just because they introduce a lot of overhead
I believe I already posted this one, but somehow I can't find it in the history, so here it comes again
http://www.frankmcsherry.org/graph/scalability/cost/2015/01/15/COST.html
www.frankmcsherry.org
Scalability! But at what COST?
Michael Isard, Derek Murray, and I recently sent in a HotOS submission (it’s not blind, so no harm talking about it, we think). The subject is hinted at from...
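The metric from the post, COST (Configuration that Outperforms a Single Thread), is easy to sketch: the smallest core count at which the scalable system actually beats a competent single-threaded implementation. The function below and the numbers in the usage note are illustrative only, not taken from the post.

```python
def cost(system_runtimes, single_thread_runtime):
    """Smallest core count at which the system beats the single-threaded baseline.

    system_runtimes: dict mapping core count -> runtime (same unit as the baseline).
    Returns None when no measured configuration is faster (unbounded COST).
    """
    for cores in sorted(system_runtimes):
        if system_runtimes[cores] < single_thread_runtime:
            return cores
    return None
```

For made-up timings `{1: 1000, 16: 400, 64: 250, 128: 150}` against a 300-second single-threaded run, `cost(...)` gives 64: the system scales nicely, but needs 64 cores just to break even with one thread.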
CNTK 2.0. Nice production-related features, note Model Compression, Keras support and impressive multi-GPU scaling.
> If your project involves sequence processing, such as speech, natural language understanding, machine translation, etc., CNTK should be your best choice. And if you are a vision researcher working on video processing, you should definitely give CNTK a try.
https://www.microsoft.com/en-us/cognitive-toolkit/blog/2017/06/microsofts-high-performance-open-source-deep-learning-toolkit-now-generally-available/
Microsoft Cognitive Toolkit
Microsoft’s high-performance, open source, deep learning toolkit is now generally available - Microsoft Cognitive Toolkit
Microsoft Cognitive Toolkit version 2.0 is now in full release with general availability. Cognitive Toolkit enables enterprise-ready, production-grade AI by allowing users to create, train, and evaluate their own neural networks that can then scale efficiently…
Unicode nuances you probably don't need to know about :)
http://www.unicode.org/reports/tr29/#Grapheme_Cluster_Boundaries
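A small illustration of why grapheme clusters matter: the strings below each look like a single character, yet Python's `len` counts code points. The standard library has no grapheme segmentation, so actually splitting text by the TR29 rules needs a third-party package; this only shows the mismatch.

```python
import unicodedata

# Each of these is one user-perceived character (one grapheme cluster).
decomposed = "e\u0301"                                  # 'e' + COMBINING ACUTE ACCENT
composed = unicodedata.normalize("NFC", decomposed)     # single precomposed code point
flag = "\U0001F1F3\U0001F1F1"                           # regional indicators N + L
family = "\U0001F468\u200D\U0001F469\u200D\U0001F467"   # man, ZWJ, woman, ZWJ, girl

# len() counts code points, not grapheme clusters.
code_points = {
    "decomposed": len(decomposed),
    "composed": len(composed),
    "flag": len(flag),
    "family": len(family),
}
print(code_points)
```

Normalization only helps in the first case; the flag and the family emoji stay multi-code-point no matter what, which is exactly what the boundary rules in the report deal with.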
Chronology of Math
http://www-history.mcs.st-andrews.ac.uk/Chronology/full.html
Volume Sliders by Reddit
https://www.reddit.com/r/ProgrammerHumor/search?q=volume&restrict_sr=on
Exploring LSTMs - with pretty pictures
http://blog.echen.me/2017/05/30/exploring-lstms/
the ever-important stuff
http://opentranscripts.org/transcript/coming-war-general-computation/
Open Transcripts
The Coming War on General Computation - Cory Doctorow | Open Transcripts
The general shape of the copyright wars and the lessons they can teach us about the upcoming fights over the destiny of the general purpose computer are important.
Not only nice visualizations, but entertaining stories too!
http://quillette.com/2017/05/26/paradoxes-probability-statistical-strangeness/
Quillette
Paradoxes of Probability and Other Statistical Strangeness
You don’t have to wait long to see a headline proclaiming that some food or behaviour is associated with either an increased or a decreased health risk, or often both. How can it be that seemingly rigorous scientific studies can produce opposite conclusions?…
Even considering how much I generally dislike Linux, I'm very excited about userfaultfd and the possibilities it gives developers
https://lwn.net/Articles/718198/
lwn.net
The next steps for userfaultfd()
The userfaultfd() system call allows user space to intervene in the handling of page faults. As Andrea Arcangeli and Mike Rapoport described in a 2017 Linux Storage, Filesystem, and Memory-Management Summit session dedicated to the subject, userfaultfd()…
Oh, the Dutch-lobbied movement is gaining traction! Meanwhile #scihub :)
http://www.sciencemag.org/news/2016/05/dramatic-statement-european-leaders-call-immediate-open-access-all-scientific-papers
Science
In dramatic statement, European leaders call for ‘immediate’ open access to all scientific papers by 2020
Observers are skeptical goal can be achieved