# Introduction

The US National Institutes of Health (NIH) received funding of approximately $42 billion in fiscal year 2022; $31 billion (72%) of this was awarded by the NIH as research grant funding to hospitals, medical colleges, non-profits, businesses, and other organizations based in the U.S. and abroad.[https://nexus.od.nih.gov/all/2021/04/21/fy-2020-by-the-numbers-extramural-investments-in-research] The NIH tracks this substantial flow of grant funding in a public database called “RePORTER”, which can be queried through a web-based interface as well as an API.
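To make the API route a little more concrete, here is a minimal Python sketch of how a project search for one fiscal year might be assembled. The endpoint URL and the `criteria` / `fiscal_years` / `limit` / `offset` payload keys are my assumptions about the RePORTER v2 API and should be checked against the official documentation; `build_search_request` and `fetch_projects` are hypothetical helper names.

```python
import json
import urllib.request

# Assumed endpoint for the RePORTER v2 project search; verify before relying on it.
SEARCH_URL = "https://api.reporter.nih.gov/v2/projects/search"


def build_search_request(fiscal_year, limit=25, offset=0):
    """Build (but do not send) a POST request asking for one fiscal year's projects."""
    payload = {
        # Payload keys are assumptions about the API's search schema.
        "criteria": {"fiscal_years": [fiscal_year]},
        "limit": limit,
        "offset": offset,
    }
    return urllib.request.Request(
        SEARCH_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def fetch_projects(fiscal_year, limit=25):
    """Send the request and return the decoded JSON response (needs network access)."""
    request = build_search_request(fiscal_year, limit)
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)
```

Calling `fetch_projects(2022)` would send the query. Since the response layout is not described here, it is safer to inspect the keys of the returned object than to rely on any particular field names.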

## A meditation for technical geniuses

A quick diversion from studying to comment on a phenomenon I’ve witnessed at least a few times.

Imagine a brilliant technical prodigy, fluent in several programming languages, adept at statistical inference, capable with all manner of machine- and deep-learning procedures. They don’t work for Google, but they are still pretty good. You ask them for help figuring something out in Excel, or pulling together a report from a SQL database, which ends up being…beneath their abilities…

## Bayesian Model Based Optimization in R

Model-based optimization (MBO) is a smart approach to tuning the hyperparameters of machine learning algorithms with less CPU time and manual effort than standard grid search approaches. The core idea behind MBO is to directly evaluate fewer points within a hyperparameter space, and to instead use a “surrogate model” which estimates what the result of your objective function would be in new locations by interpolating (nonlinearly) between the observed results of a small sample of initial evaluations. Many methods can be used to construct the surrogate model. This post will focus on implementing the Bayesian method of Gaussian Process (GP) smoothing (aka “kriging”), which is borrowed from – and particularly well-suited to – spatial applications.
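The surrogate idea can be sketched in a few lines. The example below is in plain Python rather than R, and it is a toy illustration, not the implementation the post goes on to build. It assumes a one-dimensional objective `(x - 2)^2` standing in for an expensive tuning run, a squared-exponential kernel with length scale 1, and a lower-confidence-bound rule (mean minus two standard deviations) for picking the next point. All of those choices are mine.

```python
import math


def rbf(a, b, length_scale=1.0):
    """Squared-exponential kernel between two scalar inputs."""
    return math.exp(-0.5 * ((a - b) / length_scale) ** 2)


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - tail) / M[i][i]
    return x


def gp_predict(xs, ys, x_new, length_scale=1.0, jitter=1e-6):
    """Posterior mean and standard deviation of a zero-mean, unit-variance GP."""
    n = len(xs)
    K = [[rbf(xs[i], xs[j], length_scale) + (jitter if i == j else 0.0)
          for j in range(n)] for i in range(n)]
    k_star = [rbf(x, x_new, length_scale) for x in xs]
    mean = sum(k * a for k, a in zip(k_star, solve(K, ys)))
    var = 1.0 - sum(k * v for k, v in zip(k_star, solve(K, k_star)))
    return mean, math.sqrt(max(var, 0.0))


def standardize(values):
    """Rescale to mean 0 and standard deviation 1 to match the GP prior."""
    mu = sum(values) / len(values)
    sd = math.sqrt(sum((v - mu) ** 2 for v in values) / len(values)) or 1.0
    return [(v - mu) / sd for v in values]


def objective(x):
    """Toy stand-in for an expensive model-tuning run (an assumption)."""
    return (x - 2.0) ** 2


xs = [0.0, 2.5, 5.0]                         # small initial design
ys = [objective(x) for x in xs]
candidates = [i * 0.05 for i in range(101)]  # grid over [0, 5]

for _ in range(10):
    z = standardize(ys)

    def lower_confidence_bound(x):
        mean, sd = gp_predict(xs, z, x)
        return mean - 2.0 * sd               # low mean or high uncertainty wins

    # Skip points that were already evaluated to keep the kernel matrix stable.
    pool = [c for c in candidates if all(abs(c - x) > 1e-9 for x in xs)]
    x_next = min(pool, key=lower_confidence_bound)
    xs.append(x_next)
    ys.append(objective(x_next))

best_y, best_x = min(zip(ys, xs))
print(f"best x = {best_x:.2f}, objective = {best_y:.4f}")
```

The loop only calls `objective` 13 times in total. Every other candidate is scored by the surrogate, which is where the savings over grid search come from.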

## GAMs and scams: Part 1

People who do statistical modeling for insurance applications usually know their way around a GLM pretty well. In pricing applications, GLMs can produce a reasonable model to serve as the basis of a rating plan, but in my experience they are usually followed by a round of “selections” – a process to incorporate business considerations and adjust the “indicated” rate relativities to arrive at those which will be implemented (and filed with regulators, if required). Selections can be driven by various constraints and considerations external to the data such as:

## Apps for Hartford – haRtisan

Happy 2021. Volunteering is probably related to the meaning of life somehow, at least indirectly. Part of the reason I decided to stand up this site was to have a means of publishing tools that will be useful to the Hartford residents who spend countless hours volunteering for committees, city commissions, NRZs, showing up for public comment periods, or just picking up trash on their block.