R - Filtering for only complete sets of years


I have data on yield organized by state and county. Out of this data I want to retain only those counties that provide complete years between 1970 and 2000.

The following code clears away some incomplete cases, but fails to omit all of them on the larger data set.

Some fake data:

    k <- 5  # number of rows to set to NaN
    df <- data.frame(state = c(rep(1, 10), rep(2, 10)),
                     county = rep(1:4, 5),
                     yield = 100)
    df[sample(1:20, k), 3] <- NaN

Current code:

    df1 <- read.csv("gly2.csv", header = TRUE)
    df <- data.frame(df1)

    droprows_1 <- function(df, v1, v2, v3, value = 'x') {
      idx <- df[, v3] == value
      todrop <- df[idx, c(v1, v2)]  # should have k rows (the missing ones)
      todrop <- unique(todrop)      # unique values, so possibly fewer rows
      nrow <- dim(todrop)[1]
      for (i in 1:nrow) {
        idx <- apply(df, 1, function(x) all(x == todrop[i, ]))
        df <- df[!idx, ]
      }
      return(df)
    }

    qq <- droprows_1(df, 1, 2, 3)

Thank you.

To drop every county that has even a single missing value, use:

    library(dplyr)

    df %>%
      group_by(county) %>%
      filter(!any(is.nan(yield)))
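The pipeline above groups by county alone. In the fake data the county IDs 1 to 4 occur in both states, so it may be safer to group on state and county together (in dplyr, `group_by(state, county)`). The question also asks for complete years between 1970 and 2000, which the fake data cannot show because it has no year column. The following is a minimal base-R sketch of both ideas, not a definitive solution. It assumes a `year` column, which does not appear in the question, and the helper name `keep_complete` is made up for illustration.

```r
# Fake data in the spirit of the question, plus an ASSUMED `year` column.
set.seed(1)
df <- expand.grid(year = 1970:2000, county = 1:4, state = 1:2)
df$yield <- 100
df$yield[sample(nrow(df), 5)] <- NaN   # knock out five observations

# Hypothetical helper: keep only (state, county) pairs that have a usable
# yield for every year in `years`.
keep_complete <- function(df, years = 1970:2000) {
  # is.na() is TRUE for both NA and NaN, so either kind of gap is caught
  usable <- !is.na(df$yield) & df$year %in% years
  # per (state, county): number of distinct years with a usable yield
  n_years <- ave(ifelse(usable, df$year, NA), df$state, df$county,
                 FUN = function(y) length(unique(y[!is.na(y)])))
  df[n_years == length(years) & df$year %in% years, ]
}

complete <- keep_complete(df)
```

Counting distinct years per group, rather than only testing for missing values, also drops counties whose rows for some years are absent altogether, which a check on `is.nan(yield)` alone would not notice.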
