Funding Sources for Covid Research

repoRter.nih: a convenient R interface to the NIH RePORTER Project API

Introduction

The US National Institutes of Health (NIH) received funding of approximately $42 billion in fiscal year 2022; $31 billion (72%) of this was awarded by the NIH in the form of research grant funding to hospitals, medical colleges, non-profits, businesses, and other organizations based in the US and abroad.[https://nexus.od.nih.gov/all/2021/04/21/fy-2020-by-the-numbers-extramural-investments-in-research] The NIH maintains a database called “RePORTER” to track this substantial flow of grant funding and makes it available to the public via a web-based query interface as well as an API.

A meditation for technical geniuses

A quick diversion from studying to comment on a phenomenon I’ve witnessed at least a few times.

Imagine a brilliant technical prodigy, fluent in several programming languages, adept at statistical inference, capable with all manner of machine- and deep-learning procedures. They don’t work for Google, but they are still pretty good. You ask them for help figuring something out in Excel, or pulling together a report from a SQL database, which ends up being…beneath their abilities…

Feel-good conspiracy theories

I’m a bit dismayed by the quality of the news reporting I’ve seen over the past few days. I’m talking about that David-and-Goliath story featuring a gritty, if not mettlesome, band of small-time retail day-traders coming together on Reddit to take down the greedy hedge fund billionaire shorts who’ve until now gotten away with colluding to screw over the “little guy” and all other vague sorts of evil deeds probably.

Bayesian Model Based Optimization in R

Model-based optimization (MBO) is a smart approach to tuning the hyperparameters of machine learning algorithms with less CPU time and manual effort than standard grid search approaches. The core idea behind MBO is to directly evaluate fewer points within a hyperparameter space, and to instead use a “surrogate model” which estimates what the result of your objective function would be at new locations by interpolating (nonlinearly) between the observed results from a small sample of initial evaluations. Many methods can be used to construct the surrogate model. This post will focus on implementing the Bayesian method of Gaussian Process (GP) smoothing (aka “kriging”), which is borrowed from – and particularly well-suited to – spatial applications.
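To make the surrogate idea concrete, here is a minimal sketch of the loop, written in plain Python for brevity even though the post itself works in R. Everything in it is an assumption made for illustration: the toy one-dimensional objective, the squared-exponential kernel with a fixed length scale, the lower-confidence-bound rule for picking the next point, and all function names. It is not the implementation from the full post, just one simple instance of GP-based MBO.

```python
import math

def rbf(a, b, length_scale=1.0):
    """Squared-exponential kernel between two scalars (assumed kernel choice)."""
    return math.exp(-0.5 * ((a - b) / length_scale) ** 2)

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x

def gp_predict(xs, ys, x_new, jitter=1e-6):
    """GP posterior mean and standard deviation at x_new.

    The prior mean is set to the sample mean of ys; jitter on the
    diagonal keeps the kernel matrix numerically invertible.
    """
    mean_y = sum(ys) / len(ys)
    K = [[rbf(a, b) + (jitter if i == j else 0.0) for j, b in enumerate(xs)]
         for i, a in enumerate(xs)]
    k_star = [rbf(a, x_new) for a in xs]
    alpha = solve(K, [y - mean_y for y in ys])
    v = solve(K, k_star)
    mu = mean_y + sum(k * a for k, a in zip(k_star, alpha))
    var = max(rbf(x_new, x_new) - sum(k * w for k, w in zip(k_star, v)), 0.0)
    return mu, math.sqrt(var)

def objective(x):
    """Hypothetical stand-in for an expensive model-tuning run (minimized)."""
    return (x - 2.0) ** 2

def mbo(objective, lower, upper, n_init=4, n_iter=10, kappa=2.0):
    """Minimize objective on [lower, upper] with a GP surrogate."""
    # Small initial design: evenly spaced points evaluated directly.
    xs = [lower + (upper - lower) * i / (n_init - 1) for i in range(n_init)]
    ys = [objective(x) for x in xs]
    candidates = [lower + (upper - lower) * i / 200 for i in range(201)]

    for _ in range(n_iter):
        def lcb(x):
            # Lower confidence bound: favors low predicted values
            # and regions where the surrogate is still uncertain.
            mu, sd = gp_predict(xs, ys, x)
            return mu - kappa * sd
        x_next = min((c for c in candidates if c not in xs), key=lcb)
        xs.append(x_next)
        ys.append(objective(x_next))  # the only "expensive" call per round

    best = min(range(len(ys)), key=ys.__getitem__)
    return xs[best], ys[best]

best_x, best_y = mbo(objective, 0.0, 5.0)
print(best_x, best_y)
```

The objective is called only 14 times here (4 initial points plus 10 proposals), while the surrogate is queried at about 200 candidate locations per round. That trade is the point of MBO: cheap predictions stand in for expensive evaluations.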

GAMs and scams: Part 1

People who do statistical modeling for insurance applications usually know their way around a GLM pretty well. In pricing applications, GLMs can produce a reasonable model to serve as the basis of a rating plan, but in my experience they are usually followed by a round of “selections” – a process to incorporate business considerations and adjust the “indicated” rate relativities to arrive at those which will be implemented (and filed with regulators, if required). Selections can be driven by various constraints and considerations external to the data such as:

Apps for Hartford – haRtisan

Happy 2021. Volunteering is probably related to the meaning of life somehow, at least indirectly. Part of the reason I decided to stand up this site was to have a means of publishing tools that will be useful to the Hartford residents who spend countless hours volunteering for committees, city commissions, NRZs, showing up for public comment periods, or just picking up trash on their block.

A Great Store of Codfish

English explorers were a trifle on-the-nose with their naming conventions in the early 1600s when Bartholomew Gosnold gave the name “Cape Cod” to the sandy, hooked peninsula in southeastern Massachusetts after pulling in a big haul of cod there. I went with other intentions back in September (2020) and am now getting around to writing about it.