[Update: and another followup here]
This is a follow-up to the preceding article on New Zealand Births and Friday the thirteenth. In discussion of that article, the point came up that hospitals only schedule c-sections on certain days of the week, and I wanted to see whether this shows up in the data. If it does, we would expect births to be lower for weekend days and higher for weekdays (and possibly especially high for particular weekdays, if c-sections have tended to be scheduled for those days over time).
Again the data is the aggregated births by day of year for New Zealand 1980-2014, so again day of week is not present and needs to be inferred.
For every calendar date (month and day) across 1980 to 2014 inclusive, you can calculate how many times it fell on each day of the week. So, for example, January 8th across 1980-2014 has 4 Mondays, 6 Tuesdays and 5 of every other day of the week. Using this we can compare the summarised number of births for the group of dates with 4 Mondays to the group with 5 Mondays to the group with 6 Mondays (and so on with Tuesdays, Wednesdays, etc).
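As a quick check of the January 8th example, here is a minimal sketch in base R. It is separate from the full analysis code; the variable names (`jan8`, `iso_day`, `jan8_counts`) are just for illustration, and it uses the weekday numbers from `format(x, "%u")` rather than day names so that the result should not depend on the locale.

```r
# Count how often 8 January falls on each weekday across 1980-2014.
years <- 1980:2014
jan8 <- as.Date(sprintf("%d-01-08", years))

# %u gives the ISO weekday number (1 = Monday ... 7 = Sunday),
# which avoids relying on locale-specific day names.
iso_day <- as.integer(format(jan8, "%u"))

# Tabulate against a fixed set of labels so every weekday appears in order.
day_names <- c("Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun")
jan8_counts <- table(factor(day_names[iso_day], levels = day_names))
print(jan8_counts)
```

The same tabulation applied to every month and day combination gives the 4, 5 and 6 groups used in the comparison.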
As part of this I am deliberately excluding February 29th, as leap days happen very infrequently, along with December 25th and 26th, January 1st and 2nd, and February 6th as fixed public holidays when c-sections were not scheduled.
The big caveat on these numbers is that the standard deviation of each group is in the range of 150 to 200, so keep that in mind when comparing the differences.
If we just check the step differences rather than running a formal ANOVA (to be honest, I don’t think an ANOVA would be appropriate, since there are dependencies in the data: something with 4 Mondays by definition has 6 of another day), we get the 4,5,6 pattern of:
Keeping in mind the interdependence in the data (rather than independence) and the large standard deviations relative to the observed differences, I think we can say the data suggests scheduled c-sections are not performed on a Sunday (the more Sundays, the consistently fewer births) and are performed on a Wednesday (the more Wednesdays, the consistently more births). There are weaker relationships for Thursday and Saturday, which could be genuinely weaker effects or could be more heavily masked by the variation or dependence (just one of those things).
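The interdependence can be checked directly. The following is a small sketch in base R, not part of the analysis code itself: each date other than February 29th occurs 35 times in 1980-2014, so its seven weekday counts must sum to 35, and a date with a weekday occurring 4 times has to have another occurring 6 times. The names `month_day`, `wday_counts`, `has_four` and `has_six` are illustrative only.

```r
# All dates from 1980 to 2014, keyed by month-day and ISO weekday number.
dts <- seq(as.Date("1980-01-01"), as.Date("2014-12-31"), by = "day")
month_day <- format(dts, "%m-%d")
iso_day <- format(dts, "%u")  # "1" = Monday ... "7" = Sunday

# Rows are calendar dates, columns are weekdays, cells are occurrence counts.
wday_counts <- unclass(table(month_day, iso_day))
wday_counts <- wday_counts[rownames(wday_counts) != "02-29", ]  # drop leap day

# Each remaining date occurs once per year, so each row should total 35.
row_totals <- rowSums(wday_counts)

# Dates where some weekday occurs 4 times, and where some weekday occurs 6 times.
has_four <- apply(wday_counts == 4, 1, any)
has_six <- apply(wday_counts == 6, 1, any)

print(range(wday_counts))            # smallest and largest weekday count
print(all(row_totals == 35))         # do all rows sum to 35?
print(identical(has_four, has_six))  # does a 4 always come with a 6?
```

If both checks print `TRUE`, the groups being compared are not independent samples: a date in the "4 Mondays" group is always also in the "6 of some other day" group.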
As ever, R code:
    # Every date in the study period and its day of the week.
    # Note: weekdays() returns locale-dependent names; this assumes English.
    dts <- seq.Date(from=as.Date("1980-01-01"), to=as.Date("2014-12-31"), by=1)
    dys <- weekdays(dts)

    # Download and read the births-by-day-of-year table.
    download.file("http://www.stats.govt.nz/~/media/Statistics/browse-categories/population/pop-birthdays-table/most-common-birthdays-19802014.xlsx",
                  destfile="b.xlsx", mode="wb")
    library(readxl)
    bnums <- read_excel("b.xlsx", sheet = 2, skip=2)
    # Rename only the first column; the rest keep their month names.
    names(bnums)[1] <- "day"

    # Reshape to one row per month and day.
    library(tidyr)
    b3 <- gather(bnums, key="month", value="births", January:December)

    # Count how often each calendar date fell on each weekday.
    library(lubridate)
    mth <- month(dts, label=TRUE, abbr=FALSE)
    mdt <- mday(dts)
    alldates <- data.frame(dts, dys, mth, mdt)
    library(dplyr)
    alldates %>%
      group_by(mth, mdt) %>%
      summarise(Mondays = sum(dys == "Monday"),
                Tuesdays = sum(dys == "Tuesday"),
                Wednesdays = sum(dys == "Wednesday"),
                Thursdays = sum(dys == "Thursday"),
                Fridays = sum(dys == "Friday"),
                Saturdays = sum(dys == "Saturday"),
                Sundays = sum(dys == "Sunday")) -> aggWdays

    combinedData <- merge(b3, aggWdays, by.x=c("month","day"), by.y=c("mth","mdt"))

    # Exclude Feb 29, Dec 25 & 26, Jan 1 & 2 and Feb 6, rearrange data,
    # then generate summary stats
    combinedData %>%
      filter(!(month=="January" & day==1), !(month=="January" & day==2),
             !(month=="February" & day==6), !(month=="February" & day==29),
             !(month=="December" & day==25), !(month=="December" & day==26)) %>%
      gather(key=DoW, value=dCount, Mondays:Sundays) %>%
      group_by(DoW, dCount) %>%
      summarise(meanbirths=mean(births), sd=sd(births), n=n()) -> sumStats

    library(xtable)
    print(xtable(sumStats), type = "html")

    # Same exclusions, then the step differences in mean births (4 to 5, 5 to 6)
    combinedData %>%
      filter(!(month=="January" & day==1), !(month=="January" & day==2),
             !(month=="February" & day==6), !(month=="February" & day==29),
             !(month=="December" & day==25), !(month=="December" & day==26)) %>%
      gather(key=DoW, value=dCount, Mondays:Sundays) %>%
      group_by(DoW, dCount) %>%
      summarise(mean=mean(births)) %>%
      select(DoW, dCount, mean) %>%
      spread(key=dCount, value=mean) %>%
      mutate(fourTofive = `5` - `4`, fiveTosix = `6` - `5`) %>%
      select(DoW, fourTofive, fiveTosix) -> results

    print(xtable(results), type = "html")