<?xml version='1.0' encoding='utf-8' ?>

<rss version='2.0' xmlns:lj='http://www.livejournal.org/rss/lj/1.0/' xmlns:atom10='http://www.w3.org/2005/Atom'>
<channel>
  <title>Gustavo Lacerda</title>
  <link>https://gusl.dreamwidth.org/</link>
  <description>Gustavo Lacerda - Dreamwidth Studios</description>
  <lastBuildDate>Sat, 21 Apr 2012 16:57:43 GMT</lastBuildDate>
  <generator>LiveJournal / Dreamwidth Studios</generator>
  <lj:journal>gusl</lj:journal>
  <lj:journaltype>personal</lj:journaltype>
  <image>
    <url>https://v2.dreamwidth.org/690564/671338</url>
    <title>Gustavo Lacerda</title>
    <link>https://gusl.dreamwidth.org/</link>
    <width>100</width>
    <height>100</height>
  </image>

<item>
  <guid isPermaLink='true'>https://gusl.dreamwidth.org/1149621.html</guid>
  <pubDate>Sat, 21 Apr 2012 16:57:43 GMT</pubDate>
  <title>R: semantics and pragmatics / names vs values</title>
  <link>https://gusl.dreamwidth.org/1149621.html</link>
  <description>One of the annoyances in R is dealing with functions that don&apos;t evaluate one or more of the arguments you pass, or that otherwise use the name of the variable passed.  The problem appears when you try to write abstractly.&lt;br /&gt;&lt;br /&gt;e.g. &lt;code&gt;with(data, ZQ/Total.Z)&lt;/code&gt; will compute &lt;code&gt;data$ZQ / data$Total.Z&lt;/code&gt;.  The effect is as if &apos;with&apos; parsed that expression, figured out which variable tokens are already present in the current environment, and put &quot;data$&quot; in front of the rest (internally, it evaluates the captured expression with the columns of &apos;data&apos; in scope).  Yesterday, in my naivety, I implemented just that (28 easy-to-read lines of R).&lt;br /&gt;&lt;br /&gt;However, it&apos;s hard to do something more abstract, e.g. &lt;code&gt;with(data, property)&lt;/code&gt; will try to get a property named &quot;property&quot;.  To circumvent this, one can make a call to &lt;code&gt;eval&lt;/code&gt;:&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;withExpr &amp;lt;- jPaste(&quot;with(x,&quot;,property,&quot;)&quot;)
eval(parse(text=withExpr))&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;I am not happy with this, but there is NO OTHER WAY.  I say this confidently because &apos;with&apos; appears to completely discard the value of the variable passed, while only using its name, i.e. something like:&lt;br /&gt;&lt;br /&gt;&lt;code&gt;property &amp;lt;- deparse(substitute(property))&lt;/code&gt;&lt;br /&gt;&lt;br /&gt;Having to call eval is the price we pay for the convenience of not using quotes.&lt;br /&gt;&lt;br /&gt;And, guess what, I take the deal!  Yesterday, I wrote &apos;violinPlot&apos;, which is like a &apos;boxplot&apos; but with kernel density estimates instead of quantiles.  The two basic arguments to violinPlot are &apos;datasets&apos; and &apos;property&apos;: for each dataset, it extracts the property and plots a violin.&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;l &amp;lt;- list(mon, tue, wed, thu, fri, sat, sun)
violinPlot(l, ZQ/Total.Z, col=c(rep(&quot;#AAAAFF&quot;,5), rep(&quot;orange&quot;, 2)), horizontal=FALSE)&lt;/pre&gt;&lt;br /&gt;&lt;img src=&quot;http://dl.dropbox.com/u/4521346/violins.png&quot;&gt;&lt;br /&gt;My code starts with:&lt;br /&gt;&lt;pre&gt;
violinPlot &amp;lt;- function(datasets, property,
                       labels=c(&quot;M&quot;, &quot;T&quot;, &quot;W&quot;, &quot;R&quot;, &quot;F&quot;, &quot;Sa&quot;, &quot;Su&quot;),
                       horizontal=TRUE, colors=NA){
  property &amp;lt;- deparse(substitute(property))
  colors &amp;lt;- rep(colors, length.out=length(datasets))  # recycle to exactly one color per dataset
  densities &amp;lt;- lapply(datasets, function(x) density(with2(x,property)))
  ...
}

with2 &amp;lt;- function(data, expr, ...)
  with(data, eval(parse(text=expr)), ...)
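&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;A minimal sketch of a variant that still calls &lt;code&gt;eval&lt;/code&gt; but skips the deparse/parse string round trip: capture the expression with &lt;code&gt;substitute&lt;/code&gt; and evaluate it directly in each data frame.  The helper name &lt;code&gt;extractProperty&lt;/code&gt; and the toy data frames are made up for illustration:&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;
# Hypothetical helper: evaluate one unquoted expression in every dataset
extractProperty &amp;lt;- function(datasets, property) {
  expr &amp;lt;- substitute(property)   # unevaluated expression, e.g. ZQ/Total.Z
  caller &amp;lt;- parent.frame()        # names that are not columns are looked up here
  lapply(datasets, function(x) eval(expr, x, caller))
}

# Toy data, invented for this example
mon &amp;lt;- data.frame(ZQ=c(80, 90), Total.Z=c(400, 450))
tue &amp;lt;- data.frame(ZQ=c(70, 60), Total.Z=c(350, 400))
ratios &amp;lt;- extractProperty(list(mon, tue), ZQ/Total.Z)

# A stored expression covers the more abstract case without building a string
property &amp;lt;- quote(ZQ/Total.Z)
stored &amp;lt;- eval(property, mon)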
&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;You can see above that I also wanted to pass &apos;property&apos; without quotes.  Having essentially reimplemented &apos;with&apos;, I am in a position to modify it so that the syntax becomes &lt;code&gt;with(data, &quot;ZQ/Total.Z&quot;)&lt;/code&gt;, and spare myself the eval next time... but I don&apos;t wanna.&lt;br /&gt;&lt;br /&gt;But here&apos;s what I might do: instead of &lt;code&gt;with(data,expr)&lt;/code&gt;, make it &lt;code&gt;with(data, exprLiteral=NULL, exprToEvaluate=NULL)&lt;/code&gt;, and you would only pass one of these expr arguments.  The difference is that &apos;exprToEvaluate&apos; gets evaluated into a string (so it better be a string!); whereas &apos;exprLiteral&apos; gets turned into a string directly, and corresponds to the current syntax of &apos;with&apos;... and since &apos;exprLiteral&apos; comes first (in the second position of the argument list), current calls to &apos;with&apos; would continue working.  Yay, backward-compatibility!&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;More pretty graphics:&lt;span class=&quot;cut-wrapper&quot;&gt;&lt;span style=&quot;display: none;&quot; id=&quot;span-cuttag___1&quot; class=&quot;cuttag&quot;&gt;&lt;/span&gt;&lt;b class=&quot;cut-open&quot;&gt;(&amp;nbsp;&lt;/b&gt;&lt;b class=&quot;cut-text&quot;&gt;&lt;a href=&quot;https://gusl.dreamwidth.org/1149621.html#cutid1&quot;&gt;Read more...&lt;/a&gt;&lt;/b&gt;&lt;b class=&quot;cut-close&quot;&gt;&amp;nbsp;)&lt;/b&gt;&lt;/span&gt;&lt;div style=&quot;display: none;&quot; id=&quot;div-cuttag___1&quot; aria-live=&quot;assertive&quot;&gt;&lt;/div&gt;</description>
  <comments>https://gusl.dreamwidth.org/1149621.html</comments>
  <category>programming</category>
  <lj:security>public</lj:security>
  <lj:reply-count>0</lj:reply-count>
</item>
<item>
  <guid isPermaLink='true'>https://gusl.dreamwidth.org/1147503.html</guid>
  <pubDate>Thu, 05 Apr 2012 03:45:40 GMT</pubDate>
  <title>what I love about R</title>
  <link>https://gusl.dreamwidth.org/1147503.html</link>
  <description>One thing I really love about R is how I can write improperly-scoped code, and everything still works.&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;
gSmooth &amp;lt;- function(x,y, kernelSd=1, kernel=function(z) dnorm(z,mean=x[i],sd=kernelSd)){
  v &amp;lt;- c()
  for (i in seq_len(length(x))){
    weights &amp;lt;- sapply(x, kernel)
    v[i] &amp;lt;- sum(weights*y)/sum(weights)
  }
  list(x=x,y=v)
}

plot(data$ZQ, type=&quot;l&quot;, ylim=c(0,130))
ss &amp;lt;- gSmooth(1:n,data$ZQ)
pplot(ss$x, ss$y, type=&quot;l&quot;, col=&quot;red&quot;)
ss &amp;lt;- gSmooth(1:n,data$ZQ, kernelSd=3)
pplot(ss$x, ss$y, type=&quot;l&quot;, col=&quot;blue&quot;)
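
## Aside, as an illustrative sketch with hypothetical names f and g: the default
## kernel above can refer to x[i] and kernelSd because a default argument is
## evaluated lazily, inside the function&apos;s own environment, so it sees locals
## (like i) that only come into existence later in the body.
f &amp;lt;- function(n, g=function() i*10){
  out &amp;lt;- numeric(n)
  for (i in seq_len(n)) out[i] &amp;lt;- g()  # g finds the current i in f&apos;s environment
  out
}
f(3)  ## 10 20 30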
&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;&lt;img src=&quot;http://dl.dropbox.com/u/4521346/smoothing.png&quot;&gt;&lt;br /&gt;&lt;br /&gt;----&lt;br /&gt;&lt;br /&gt;This is much cleaner:&lt;br /&gt;&lt;pre&gt;gSmooth &amp;lt;- function(x,y, kernel=gaussKernel){
  v &amp;lt;- numeric(length(x))
  for (i in seq_along(x)){
    center &amp;lt;- x[i]
    weights &amp;lt;- sapply(x, function(z) kernel(z, center))
    v[i] &amp;lt;- sum(weights*y)/sum(weights)
  }
  list(x=x,y=v)
}

gaussKernel &amp;lt;- function(z, center) dnorm(z, mean=center, sd=kernelSd)
emaKernel &amp;lt;- function(z, center) {  ## Exponential Moving Average
  if (z &amp;lt;= center) exp((z-center)/kernelSd) else 0
}

plot(data$dayNumber, data$ZQ, type=&quot;p&quot;)
pplot(data$dayNumber, data$ZQ, type=&quot;l&quot;)


kernelSd &amp;lt;- 3
ss &amp;lt;- gSmooth(data$dayNumber,data$ZQ)
pplot(ss$x,ss$y, type=&quot;l&quot;, col=&quot;red&quot;)

kernelSd &amp;lt;- 3
ss &amp;lt;- gSmooth(data$dayNumber,data$ZQ, kernel=emaKernel)
pplot(ss$x,ss$y, type=&quot;l&quot;, col=&quot;blue&quot;)
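
## Aside, as an illustrative sketch: the kernels above read kernelSd from the
## global environment.  A closure can carry the bandwidth instead.  The names
## makeGaussKernel and makeEmaKernel are hypothetical.
makeGaussKernel &amp;lt;- function(sd) function(z, center) dnorm(z, mean=center, sd=sd)
makeEmaKernel &amp;lt;- function(scale) {
  function(z, center) if (z &amp;lt;= center) exp((z-center)/scale) else 0
}
ema3 &amp;lt;- makeEmaKernel(3)
ema3(1, center=4)  # exp(-1): a past point gets a decayed weight
ema3(7, center=4)  # 0: a future point gets no weight
## usage with the smoother above: gSmooth(data$dayNumber, data$ZQ, kernel=makeEmaKernel(3))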
&lt;/pre&gt;&lt;br /&gt;&lt;img src=&quot;http://dl.dropbox.com/u/4521346/gauss-vs-ema.png&quot;&gt;&lt;br /&gt;&lt;br /&gt;Note how the Exponential Moving Average (in blue) is backward-looking, and less smooth than the Gaussian one (in red), even though these kernels, when viewed as distributions, have the same standard deviation of 3.&lt;br /&gt;&lt;br /&gt;I think that this is in part due to the Exponential kernel not being as smooth as the Gaussian one, but I also suspect that it weights the points less evenly.</description>
  <comments>https://gusl.dreamwidth.org/1147503.html</comments>
  <category>programming</category>
  <lj:security>public</lj:security>
  <lj:reply-count>0</lj:reply-count>
</item>
<item>
  <guid isPermaLink='true'>https://gusl.dreamwidth.org/1086517.html</guid>
  <pubDate>Thu, 13 Jan 2011 19:08:33 GMT</pubDate>
  <title>in defense of R</title>
  <link>https://gusl.dreamwidth.org/1086517.html</link>
  <description>A lot of people in the field of machine learning like to trash R.  But as someone who comes from machine learning and who has programmed all his life, in many languages and paradigms, I have to say that R can be pretty pleasant to work with.  It&apos;s not very fast (supposedly much slower than Matlab on matrix computations, and a lot slower than C++); its commands are a bit quirky at first and many defaults are annoying (e.g. whitespace is the default separator); and there are plenty of imperfections and missing features (e.g. hashes).  And there is no serious type system.&lt;br /&gt;&lt;br /&gt;However, I find that R readily accommodates my desire to reinvent the language, which makes me very happy.  Functions are first-class objects.  &lt;code&gt;apply&lt;/code&gt; and &lt;code&gt;Reduce&lt;/code&gt; often spare me from writing looping code.  We have &lt;code&gt;eval&lt;/code&gt;!  Although there is no &lt;code&gt;defmacro&lt;/code&gt;, a lot can be accomplished with &lt;code&gt;deparse&lt;/code&gt; and &lt;code&gt;substitute&lt;/code&gt; (to be honest, I have yet to do any serious macro-ing).  In function calls, &quot;all remaining arguments&quot; bind to &apos;&lt;code&gt;...&lt;/code&gt;&apos;.  The source code is within easy reach, in case you ever wonder how e.g. &lt;code&gt;plot&lt;/code&gt; implements its default axis labels.  &lt;code&gt;do-while&lt;/code&gt; has a substitute in the form of &lt;code&gt;repeat { body; if (cond) break }&lt;/code&gt; (where &lt;code&gt;repeat&lt;/code&gt; is the same as &lt;code&gt;while(TRUE)&lt;/code&gt;).&lt;br /&gt;&lt;br /&gt;---&lt;br /&gt;&lt;br /&gt;Anyway, I have produced a substantial library for myself, and almost everything I do nowadays depends on it. 
Since I think this code could be useful for a lot of people (my debugging function, in particular), I should release a package of general-purpose R goodies someday.&lt;br /&gt;&lt;br /&gt;Today I&apos;m addressing the annoyance of having to remember parameter values, and pass them again and again to the different distribution-specific functions (e.g., in the case of the normal distribution, the set &lt;code&gt;pnorm&lt;/code&gt;, &lt;code&gt;qnorm&lt;/code&gt;, &lt;code&gt;rnorm&lt;/code&gt;, &lt;code&gt;dnorm&lt;/code&gt;).  This code bundles together the four distribution functions for any given distribution:&lt;br /&gt;&lt;span class=&quot;cut-wrapper&quot;&gt;&lt;span style=&quot;display: none;&quot; id=&quot;span-cuttag___1&quot; class=&quot;cuttag&quot;&gt;&lt;/span&gt;&lt;b class=&quot;cut-open&quot;&gt;(&amp;nbsp;&lt;/b&gt;&lt;b class=&quot;cut-text&quot;&gt;&lt;a href=&quot;https://gusl.dreamwidth.org/1086517.html#cutid1&quot;&gt;Read more...&lt;/a&gt;&lt;/b&gt;&lt;b class=&quot;cut-close&quot;&gt;&amp;nbsp;)&lt;/b&gt;&lt;/span&gt;&lt;div style=&quot;display: none;&quot; id=&quot;div-cuttag___1&quot; aria-live=&quot;assertive&quot;&gt;&lt;/div&gt;</description>
  <comments>https://gusl.dreamwidth.org/1086517.html</comments>
  <category>programming</category>
  <lj:security>public</lj:security>
  <lj:reply-count>0</lj:reply-count>
</item>
</channel>
</rss>
