# Introduction

The U.S. National Institutes of Health (NIH) received funding of approximately $42 billion in fiscal year 2022; $31 billion (72%) of this was awarded by the NIH as research grant funding to hospitals, medical colleges, non-profits, businesses, and other organizations based in the U.S. and abroad.[https://nexus.od.nih.gov/all/2021/04/21/fy-2020-by-the-numbers-extramural-investments-in-research] To track this substantial flow of grant funding, the NIH maintains a database called “RePORTER”, which is available to the public through a web-based query interface as well as an API.

## Bayesian Model-Based Optimization in R

Model-based optimization (MBO) is an approach to tuning the hyperparameters of machine learning algorithms that needs less CPU time and manual effort than standard grid search. The core idea behind MBO is to directly evaluate fewer points within a hyperparameter space and to rely instead on a “surrogate model”, which estimates what the result of your objective function would be at new locations by interpolating (not linearly) between the results observed in a small sample of initial evaluations. Many methods can be used to construct the surrogate model. This post will focus on implementing the Bayesian method of Gaussian process (GP) smoothing (aka “kriging”), which is borrowed from, and particularly well suited to, spatial applications.
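To make the surrogate idea concrete, here is a minimal sketch in dependency-free Python. It is an illustration under stated assumptions, not the implementation this post builds: the one-dimensional `objective`, the squared-exponential kernel, the length scale of 1.0, the zero prior mean, and the lower-confidence-bound rule for picking the next point are all hypothetical choices made to keep the example small.

```python
import math

def rbf(a, b, length_scale=1.0):
    """Squared-exponential kernel between two scalar inputs."""
    return math.exp(-0.5 * ((a - b) / length_scale) ** 2)

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x

def gp_predict(x_obs, y_obs, x_new, length_scale=1.0, noise=1e-6):
    """Posterior mean and variance of a zero-mean GP at x_new."""
    n = len(x_obs)
    # Kernel matrix of the observed points, with a small jitter on the diagonal.
    K = [[rbf(x_obs[i], x_obs[j], length_scale) + (noise if i == j else 0.0)
          for j in range(n)] for i in range(n)]
    k_star = [rbf(x, x_new, length_scale) for x in x_obs]
    alpha = solve(K, y_obs)   # K^-1 y
    v = solve(K, k_star)      # K^-1 k*
    mean = sum(k * a for k, a in zip(k_star, alpha))
    var = rbf(x_new, x_new, length_scale) - sum(k * w for k, w in zip(k_star, v))
    return mean, max(var, 0.0)  # clip tiny negative values from round-off

def objective(x):
    """Hypothetical stand-in for an expensive cross-validated loss."""
    return (x - 2.0) ** 2

# A small initial design: the only points where the objective is evaluated.
x_obs = [0.0, 1.0, 3.0, 4.0]
y_obs = [objective(x) for x in x_obs]

def lower_confidence_bound(x, kappa=2.0):
    """Cheap acquisition score: low predicted loss or high uncertainty."""
    mean, var = gp_predict(x_obs, y_obs, x)
    return mean - kappa * math.sqrt(var)

# Score a dense grid with the surrogate instead of the real objective.
candidates = [i * 0.1 for i in range(41)]
next_x = min(candidates, key=lower_confidence_bound)
print("next point to evaluate:", next_x)
```

The surrogate reproduces the observed losses at the four design points, where its variance is close to zero. Its uncertainty grows with distance from them. That uncertainty is what lets an MBO loop trade off exploiting promising regions against exploring unvisited ones. A full loop would evaluate `objective(next_x)`, append the result to the observations, and repeat.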

## GAMs and scams: Part 1

People who do statistical modeling for insurance applications usually know their way around a GLM pretty well. In pricing applications, GLMs can produce a reasonable model to serve as the basis of a rating plan, but in my experience they are usually followed by a round of “selections” – a process to incorporate business considerations and adjust the “indicated” rate relativities to arrive at those which will be implemented (and filed with regulators, if required). Selections can be driven by various constraints and considerations external to the data such as: