Review: Clojure for Machine Learning (Ch 1-3)
Packt Publishing has asked me to review their new book, Clojure for Machine Learning (4/2014) by Akhil Wali. Interested in both Clojure and M.L., I have taken up the challenge and want to share my impressions of the first chapters. Regarding my qualifications, I am a medium-experienced Clojure developer and briefly encountered some M.L. (regression etc. for quantitative sociological research, and neural networks) at university a decade ago, together with the related, now mostly forgotten, math such as matrices and derivatives.
In short, the book provides a good bird's-eye view of the intersection of Clojure and Machine Learning, useful for people coming from either side. It introduces a number of important methods and shows how to implement/use them in Clojure but does not - and cannot - provide deep understanding. If you are new to M.L. and, like me, really want to understand things, you should get a proper textbook (or several) on the methods and the math behind them and read it in parallel. If you know M.L. but are relatively new to Clojure, you can skip the M.L. parts you already know and study the code examples and the tools used in them.
To read it, you need only elementary knowledge of Clojure but you do need to be comfortable with math (if you haven't worked with matrices, statistics, or derivatives and equations scare you, you will have a hard time with some of the methods). You will learn how to implement some M.L. methods using Clojure - but without deep understanding, without knowledge of their limitations and issues, and without a good overview of the alternatives or the ability to pick the best one for a particular case.
Continue reading →
Most interesting links of May '14
Recommended Readings
- Monolith - from The Codeless Code - fables and koans for the SW engineer - the Monad monolith #Haskell #fun
- http2 explained (pdf, 27 pages) - cons of http 1 (huge spec / no full impl., wasteful use of TCP <=> latency [x spriting, inlining, concatenation, sharding]) => make it less latency sensitive, fix pipelining (issue a req before the previous one has finished), stop the need for an ever increasing # of connections, remove/reduce optional parts of http. Http2 is binary; multiple "streams" over 1 connection => far fewer conns, faster data delivery; header/data compression; [predictive] resource pushing. Inspired by SPDY. Chrome and Mozilla will only support it over TLS, yay! (see also Is TLS Fast Yet? [yes, it is]) Promise: faster, more responsive web pages & deprecation of http/1 workarounds => simplified web dev.
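To make "binary" and "multiple streams over 1 connection" from the http2 item concrete, here is a minimal sketch of frame packing and unpacking. It assumes the 9-byte frame header of the final RFC 7540 (24-bit length, 8-bit type, 8-bit flags, 31-bit stream id), which may differ from the draft the linked document describes; `pack_frame`, `unpack_frames` and the payloads are illustrative names and data, not part of any library.

```python
import struct

# Frame type and flag values as defined in RFC 7540 (assumption: the final
# spec, not necessarily the draft that the linked document summarizes).
DATA = 0x0
END_STREAM = 0x1


def pack_frame(frame_type, flags, stream_id, payload=b""):
    """Serialize one frame: a 9-byte binary header followed by the payload."""
    length = len(payload)
    header = struct.pack(
        ">BHBBI",
        length >> 16,            # upper 8 bits of the 24-bit length
        length & 0xFFFF,         # lower 16 bits of the 24-bit length
        frame_type,
        flags,
        stream_id & 0x7FFFFFFF,  # top bit is reserved, 31 bits of stream id
    )
    return header + payload


def unpack_frames(buffer):
    """Split received bytes into (type, flags, stream_id, payload) tuples."""
    frames, offset = [], 0
    while offset < len(buffer):
        hi, lo, frame_type, flags, stream_id = struct.unpack_from(
            ">BHBBI", buffer, offset)
        length = (hi << 16) | lo
        start = offset + 9
        payload = buffer[start:start + length]
        frames.append((frame_type, flags, stream_id & 0x7FFFFFFF, payload))
        offset = start + length
    return frames


# Two responses interleaved on a single connection: streams 1 and 3 take turns.
wire = b"".join([
    pack_frame(DATA, 0, 1, b"<html>"),
    pack_frame(DATA, 0, 3, b"body{}"),
    pack_frame(DATA, END_STREAM, 1, b"</html>"),
    pack_frame(DATA, END_STREAM, 3, b""),
])
frames = unpack_frames(wire)
```

Because every frame carries its stream id, the receiver can reassemble each response independently, which is what removes the need for many parallel connections.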
Continue reading →
Fixing clojurescript.test failing with "ReferenceError: Can't find variable: cemerick"
Tests written with clojurescript.test (cemerick.cljs.test) may fail with this confusing exception:
ReferenceError: Can't find variable: cemerick
due to a couple of reasons:
1. Your test namespaces do not require cemerick.cljs.test (and thus it is missing from the compiled .js; requiring macros is not enough)
2. cljsbuild has not included any of your test files (due to wrong setup etc.; this is essentially another form of #1)
3. You are trying to test with the node runner but have built with :optimizations :none or :whitespace (for node you need to concatenate everything into a single file, which only happens if you use :simple or :advanced optimizations)
Example failures from all the runners:
Continue reading →
Clojure/Java: Prevent Exceptions With "trace missing"
Continue reading →
ClojureScript/Om: Spurious "Minified exception occured" With Advanced Optimizations
:optimizations :advanced
Continue reading →
core.async: "Can't recur here" in ClojureScript but OK in Clojure
core.async "0.1.303.0-886421-alpha"
Continue reading →
Graphite Shows Metrics But No Data - Troubleshooting
Update: Graphite data gotchas that got me
(These gotchas explain why I did not see any data.)
- Graphite shows aggregated, not raw, data if the selected query period (24h by default) is longer than the retention period of the highest precision. F.ex. with the schema "1s:30m,1m:1d,5m:2y" you will see data at the 1s precision only if you select a period less than or equal to the past 30 minutes. With the default period, you will see the 1-minute aggregates. This applies both to the UI and whisper-fetch.py.
- By default, aggregation drops data unless at least 50% of the available data slots have values (xFilesFactor=0.5). I.e. if your app sends data less than half as often as Graphite expects them, they will never show up in aggregates. F.ex. with the schema "1s:30m,1m:1d,5m:2y" you must send data at least 30 times within a minute for them to show in an aggregate.
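The arithmetic behind both gotchas can be sketched in a few lines of Python. This is a simplified model under my own assumptions, not Graphite's actual code: the helper names (`to_seconds`, `parse_schema`, `archive_for`, `min_points_needed`) are made up for illustration, and the model simply picks the first archive whose retention covers the queried period.

```python
import math

# Seconds per unit used in retention schemas such as "1s:30m,1m:1d,5m:2y".
UNITS = {"s": 1, "m": 60, "h": 3600, "d": 86400, "y": 365 * 86400}


def to_seconds(text):
    """Convert a duration such as '30m' or '2y' to seconds."""
    return int(text[:-1]) * UNITS[text[-1]]


def parse_schema(schema):
    """Turn '1s:30m,1m:1d' into [(precision_s, retention_s), ...]."""
    return [tuple(to_seconds(part) for part in archive.split(":"))
            for archive in schema.split(",")]


def archive_for(schema, query_period):
    """Precision (seconds) of the first archive that covers the query period."""
    period = to_seconds(query_period)
    archives = parse_schema(schema)
    for precision, retention in archives:
        if retention >= period:
            return precision
    return archives[-1][0]  # period exceeds all retentions: coarsest archive


def min_points_needed(fine_precision, coarse_precision, x_files_factor=0.5):
    """Fewest fine-grained values per coarse slot for the aggregate to be kept."""
    slots = coarse_precision // fine_precision
    return math.ceil(slots * x_files_factor)


schema = "1s:30m,1m:1d,5m:2y"
last_half_hour = archive_for(schema, "30m")   # 1-second data
default_view = archive_for(schema, "24h")     # 1-minute aggregates
needed_per_minute = min_points_needed(1, 60)  # values required per minute
```

With the schema from the text, a 30-minute query is served from the 1s archive, the default 24h query from the 1m archive, and a 1-minute aggregate needs 30 of its 60 one-second slots filled.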
Lesson learned: Always send data to Graphite at *exactly* the same rate as its highest resolution
As described above, if you send data less than half as often as the highest precision expects (if 1s => send at least every 2s), aggregation will ignore the data with the default xFilesFactor=0.5 (a.k.a. the "min. 50% of values required" factor). On the other hand, if you send data more frequently than the highest precision, only the last data point received in each of the highest-precision periods is recorded and the others are ignored - that's why f.ex. the statsD flush period must equal the Graphite period.
Continue reading →
Most interesting links of April '14
Recommended Readings
- The economics of reuse - developing code for reuse costs much more than for one need - it might cost 300% more to develop and save you 75% of work when (re)using it instead of developing from scratch (if one of the factors goes down, the other one typically goes down too). Summary: "That means that to get any value from your reused component, you better have five or more reusers or you have to find a way to substantially improve the [reuse value factor] or [reusability cost factor]. Very smart people have failed to do this."
- Book in the making: Reactive Design Patterns (1st ch free)
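The reuse arithmetic in the first item above can be checked with a tiny calculation. This is only my reading of the figures, with the cost of building the code once from scratch taken as 1 unit: making it reusable costs 3 units extra (300% more) and each reuser saves 0.75 units (75%). The function names are hypothetical.

```python
def net_saving(reusers, extra_cost=3.0, saving_per_reuse=0.75):
    """Total saving across all reusers minus the extra cost of reusability.

    Units: the cost of building the code once, from scratch, equals 1.0.
    The default figures are the 300% / 75% quoted in the text (assumed model).
    """
    return reusers * saving_per_reuse - extra_cost


def first_profitable_count(extra_cost=3.0, saving_per_reuse=0.75):
    """Smallest number of reusers for which reuse yields a positive net saving."""
    reusers = 1
    while net_saving(reusers, extra_cost, saving_per_reuse) <= 0:
        reusers += 1
    return reusers


break_even = net_saving(4)           # four reusers only recover the extra cost
profitable = first_profitable_count()
```

Under these assumptions four reusers merely break even and the fifth is the first to bring any value, which is consistent with the "five or more reusers" in the quoted summary.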
Continue reading →
Clojure: How To Prevent "Expected Map, Got Vector" And Similar Errors
I should mention that I of course write tests and experiment in the REPL, but I still hit these problems, so that is not enough for me. Tests cannot protect me from having a wrong model of the input data (since I write the [unit] tests based on the same assumptions as the code and only discover the error when I integrate all the bits) and even if they help to discover an error, it is still time-consuming to find the root cause.
Can I do better? I believe I can.
Continue reading →
How to create and run Gatling 2.0 tests
0. Create a project:
$ mvn archetype:generate \
   -DarchetypeCatalog=http://repository.excilys.com/content/groups/public/archetype-catalog.xml \
   -Dfilter=io.gatling:
(The trailing ":" in the filter is important.)
1. Import to IntelliJ
In IntelliJ choose to import a project; instead of "from sources" select "from external model" and then Maven. You will also need to have the Scala plugin installed and, once imported, you will likely need to right-click on pom.xml and choose Maven - Reimport.
2. Record a simulation
- Run src/test/scala/Recorder.scala (right-click - Run 'Recorder')
- Set the port it should listen on, f.ex. 8000 (maybe you also need to set a port for HTTPS, f.ex. 8001), and set the target app (perhaps localhost, <some port>, <some https/dummy port>)
- Optionally set the class name of the recorded simulation and the target package (the output goes to src/test/scala/<package>/<name>.scala)
- Click [Start !]
- Go to your browser and configure it to use the recorder as its HTTP[s] proxy
- Browse localhost:8000/your_app as you want for the test
- Click [Stop and save] in the Recorder UI
Continue reading →
Kioo: How To Replace The Whole Body
Kioo, the enlive-inspired templating library for React.js and derived libs such as Om, normally works by matching selectors against elements inside <body> and transforming the matched elements while also keeping all the other ones. But what if you want to keep just the single matched element and drop all the others? You need some tricks and I will demonstrate one possible way.
Disclaimer: This is a result of my experimentation, not deep knowledge of Kioo.
Continue reading →
Kioo: How to Troubleshoot Template Processing
Continue reading →
Framework Joy: Load in Hibernate Updates Data
Continue reading →
Most interesting links of March '14
Recommended Readings
Clojure Corner
Continue reading →
How To Generate A Valid Credit Card Number For A Bin (First 6 Digits)
Continue reading →