Linkstream – Telegram
Various links I find interesting. Mostly hardcore tech :) // by @oleksandr_now. See @notatky for the personal stuff
Goods: Organizing Google’s Datasets '16
TLDR: Internal search engine operating over 26*10^9 *datasets* (e.g. DB tables), 5% daily churn, schema inference... interesting stuff
https://static.googleusercontent.com/media/research.google.com/en//pubs/archive/45390.pdf
This common and unfortunate fact of the lack of an adequate presentation of basic ideas and motivations of almost any mathematical theory is, probably, due to the binary nature of mathematical perception: either you have no inkling of an idea or, once you have understood it, this very idea appears so embarrassingly obvious that you feel reluctant to say it aloud; moreover, once your mind switches from the state of darkness to the light, all memory of the dark state is erased and it becomes impossible to conceive the existence of another mind for which the idea appears nonobvious.
— Gromov

Source: M. Berger, Encounter with a geometer, Part II, Notices Amer. Math. Soc. 47 (2000), no. 3, 326--340.
via @atemerev
http://www.ams.org/notices/200003/fea-berger.pdf
Google support at its best
(TLDR: Firebase costs increased 70x because they started including SSL handshake in traffic calculation)
https://news.ycombinator.com/item?id=14356409
On the Turing Completeness of MS PowerPoint 😂
https://www.andrew.cmu.edu/user/twildenh/PowerPointTM/Paper.pdf
Neurosurgeon: Collaborative Intelligence Between the Cloud and Mobile Edge
TLDR: It's best to split processing between device and cloud; here's how to do it automatically (for DNNs)
http://web.eecs.umich.edu/~jahausw/publications/kang2017neurosurgeon.pdf
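To make the "split between device and cloud" idea concrete, here is a rough sketch of picking a partition point by brute force. It is not the paper's implementation: the function name `best_split` and all parameters are hypothetical, per-layer latencies are assumed to be known up front, only latency (not energy) is considered, and the cost of sending the final result back to the device is ignored.

```python
def best_split(device_ms, cloud_ms, output_kb, input_kb, kb_per_ms):
    """Pick k so that layers [0, k) run on the device and layers [k, n) in the cloud.

    device_ms[i] / cloud_ms[i]: assumed latency of layer i on each side.
    output_kb[i]: size of layer i's output; input_kb: size of the raw input.
    kb_per_ms: assumed uplink throughput.
    Returns (k, total_latency_ms) for the cheapest split.
    """
    n = len(device_ms)
    best = None
    for k in range(n + 1):
        if k == n:
            # Everything runs on the device, nothing is uploaded.
            transfer = 0.0
        else:
            # Upload the raw input (k == 0) or the output of the last device layer.
            sent_kb = input_kb if k == 0 else output_kb[k - 1]
            transfer = sent_kb / kb_per_ms
        total = sum(device_ms[:k]) + transfer + sum(cloud_ms[k:])
        if best is None or total < best[1]:
            best = (k, total)
    return best
```

With a slow link the best split tends to land right after a layer with a small output; with a fast link, shipping the raw input to the cloud wins.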
DeepXplore: Automated Whitebox Testing of Deep Learning Systems '17
TLDR: Neuron coverage as a test metric; use 3+ similar DNNs as cross-referencing oracles and look for inputs where they disagree
https://arxiv.org/pdf/1705.06640.pdf
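Neuron coverage is roughly "what fraction of neurons fired at least once across the test inputs". A minimal sketch, assuming the activations are already collected as plain lists; the function name and the threshold default are my own, and any per-layer scaling of activations is skipped:

```python
def neuron_coverage(activations, threshold=0.0):
    """Fraction of neurons whose output exceeds `threshold` for at least one input.

    activations: one list per test input, each holding the output of every
    neuron in the network (same order and length for every input).
    """
    if not activations:
        return 0.0
    n_neurons = len(activations[0])
    covered = [False] * n_neurons
    for row in activations:
        for i, value in enumerate(row):
            if value > threshold:
                covered[i] = True
    return sum(covered) / n_neurons
```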
> Design your construct, pick your deliverable - dsDNA, plasmid DNA or purified protein - and Serotiny will instantly price your order from various vendors.
https://serotiny.bio/notes/pinecone/
We live in interesting times :)
> Big data systems may scale well, but this can often be just because they introduce a lot of overhead
I believe I already posted this one, but somehow I can't find it in the history, so here it comes again
http://www.frankmcsherry.org/graph/scalability/cost/2015/01/15/COST.html
CNTK 2.0. Nice production-related features, note Model Compression, Keras support and impressive multi-GPU scaling.

> If your project involves sequence processing, such as speech, natural language understanding, machine translation, etc., CNTK should be your best choice. And if you are a vision researcher working on video processing, you should definitely give CNTK a try.

https://www.microsoft.com/en-us/cognitive-toolkit/blog/2017/06/microsofts-high-performance-open-source-deep-learning-toolkit-now-generally-available/
and now for something completely different
http://4dtoys.com/
Unicode nuances you probably don't need to know about :)
http://www.unicode.org/reports/tr29/#Grapheme_Cluster_Boundaries
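A quick illustration of why this matters: code points are not user-perceived characters. The sketch below is a deliberately crude approximation that only glues combining marks onto the preceding character. It is not the UAX #29 algorithm (no ZWJ emoji sequences, Hangul, regional indicators or CRLF handling), and `rough_clusters` is a made-up name.

```python
import unicodedata


def rough_clusters(text):
    """Group each combining mark with the character before it (approximation only)."""
    clusters = []
    for ch in text:
        # unicodedata.combining() returns a non-zero class for combining marks.
        if clusters and unicodedata.combining(ch):
            clusters[-1] += ch
        else:
            clusters.append(ch)
    return clusters


s = "e\u0301a"  # 'e' + COMBINING ACUTE ACCENT, then 'a'
print(len(s), len(rough_clusters(s)))
```

For real segmentation, use a library that implements the full rules from the report.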
Exploring LSTMs - with pretty pictures
http://blog.echen.me/2017/05/30/exploring-lstms/