GitHub Copilot – GPT-3 powered coding
GitHub Copilot, a new AI pair programmer that suggests line completions and whole functions as you type, is now in technical preview.
I co-authored a blog post for Avanade Techs and Specs, with a particular focus on GitHub Copilot.
Nearly a year ago, we investigated several intelligent code creation tools and discussed their ethical implications, and wow, a lot can change in a year!
GitHub Copilot is now in technical preview, and now that we have our hands on it, we think it deserves a discussion of its own. Copilot is a new software tool in the Intelligent Code Creation (ICC) category. It uses powerful machine learning models from OpenAI, including GPT-3 and Codex, to help developers write code significantly faster and more efficiently.
GitHub Copilot is supported in Visual Studio Code, JetBrains IDEs (including Rider and IntelliJ), and Neovim.
Copilot can suggest solutions that fit the context you’re working in, and it can also interpret natural language in comments and function names to synthesize a solution for your current task.
Copilot is trained on software documentation and public code repositories. It also draws on your own code to improve its suggestions in line with your coding style, supplying highly relevant suggestions based on what you are currently working on.
Code generation from natural language
Thanks to OpenAI Codex, GitHub Copilot has the incredible ability to turn natural language into code. Describe the logic you want in a comment, and Copilot will try to create that functionality in code.
This can lead to one or more suggestions, including those spanning multiple lines of logic, making Copilot an excellent tool for existing developers to improve their productivity.
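To make the comment-driven workflow above concrete, here is a minimal sketch in Python. The comment on the first line is the kind of prompt a developer might type. The function beneath it is a hand-written illustration of the sort of multi-line completion a tool like Copilot might propose, not captured Copilot output, and the name is_palindrome is our own choice for this example.

```python
# Check whether a string is a palindrome, ignoring case, spaces and punctuation
def is_palindrome(text: str) -> bool:
    # Keep only letters and digits, lower-cased, so formatting does not matter
    cleaned = [ch.lower() for ch in text if ch.isalnum()]
    # A palindrome reads the same forwards and backwards
    return cleaned == cleaned[::-1]
```

Whatever the tool suggests, it is still worth reading the result against the comment to confirm that it does what you asked for.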
Learning to code
Developers who are learning to code will find Copilot a useful learning tool. It lets them play with code, using natural language to synthesize well-written, well-formatted suggestions rather than relying heavily on copy/paste, documentation, code samples, and tutorials.
This should make it faster to implement working solutions, instead of spending time unravelling sample code to find out what went wrong with the version in your clipboard, although the human ability to discern and understand will remain important.
Removing the bugbear of boilerplate... but what’s the cost?
Copilot can generate the repetitive parts of code, such as lists of sample data for testing.
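As a rough illustration of that kind of boilerplate, the sketch below shows a small list of test fixtures in Python. The Customer type, its fields, and the SAMPLE_CUSTOMERS name are all hypothetical and exist only for this example. The idea is that once the first entry or two establish the pattern, a completion tool can plausibly fill in the rest.

```python
from dataclasses import dataclass


@dataclass
class Customer:
    """Hypothetical record type used only for this illustration."""
    customer_id: int
    name: str
    email: str


# Repetitive fixture data: the kind of list a completion tool can fill in
# after the first entry or two establishes the pattern.
SAMPLE_CUSTOMERS = [
    Customer(1, "Ada Lovelace", "ada@example.com"),
    Customer(2, "Alan Turing", "alan@example.com"),
    Customer(3, "Grace Hopper", "grace@example.com"),
    Customer(4, "Linus Torvalds", "linus@example.com"),
]
```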
Beyond the benefits of a system trained on large public code repositories, there are issues to be aware of. When code suggestions are based on popularity or usage, you can’t guarantee that they use the most up-to-date implementations, libraries, and approaches.
If a new security flaw is found, Copilot may still generate code without the fix applied. With such a vast training set, there is no single source of truth or ‘correctness’, and other people’s bad coding habits may end up in your code base unnoticed. In one example we tested, we were even given someone else’s API key in a hard-coded REST request. Oops!
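One common way to keep secrets out of source code, whether the code is typed by hand or accepted from a suggestion, is to read the key from the environment at run time. The sketch below is a minimal example of that habit, not a description of anything Copilot does. The function name build_auth_headers, the EXAMPLE_API_KEY variable, and the bearer-token header format are assumptions made for illustration, so check what your own API expects.

```python
import os


def build_auth_headers(env_var: str = "EXAMPLE_API_KEY") -> dict:
    """Build request headers from a key stored in an environment variable.

    The variable name and the bearer-token scheme are assumptions for this
    sketch; the point is that no secret appears in the source file.
    """
    api_key = os.environ.get(env_var)
    if not api_key:
        # Fail loudly rather than sending a request with a missing key
        raise RuntimeError(f"Set the {env_var} environment variable first")
    return {"Authorization": f"Bearer {api_key}"}
```

If a suggestion arrives with a literal key or token in it, replacing the literal with a lookup like this one before committing keeps the secret out of the repository.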
One current pitfall arises when new libraries or frameworks are released with significant breaking changes. Since these systems are trained on existing code repositories, it will take a significant amount of time for the available code to reflect the new versions. This may lead developers who aren’t in the know to keep developing against outdated versions instead of using the latest and greatest available to them.
GitHub Copilot is an incredible update to the AI pair programming world, and it looks set to keep evolving, with more features on the horizon, including explanations of highlighted code blocks.
There's been some controversy on the internet around the use of public code, but with the rise of Tabnine, Kite, and others, the trend looks set to continue for now.