Evaluating long-context code understanding of large language models