Introduction
Some clients are moving from GitHub to Jira, so the relevant data they hold in GitHub has to be migrated. This data may include many entities that need to be migrated from the source system (GitHub, in our case) to the target system (Jira Cloud). The two systems belong to different ecosystems and offer little out-of-the-box support for migrating between them.
The aim of this document is to identify the GitHub entities that need to be migrated to Jira Cloud, decide on the migration steps, and produce a report of the entities that were imported successfully (or failed).
Note: Throughout this document, "Jira" refers to Jira Cloud unless Jira Server is named explicitly.
Requirements
We have to pull the data out of GitHub and push it to Jira. We have identified the following GitHub entities that need to be migrated to Jira:
Sl No. | GitHub Entity | Description |
---|---|---|
1 | Issues | Issues can be bugs, enhancements, change requests or any other requests related to the repository |
2 | Comments | A thread of discussion on an issue. Each issue can have multiple comments, and each comment can have multiple attachments. |
3 | Attachments | Attachments can be added to comments only; the issue description is treated as the first comment on the issue |
4 | Assignee | User account to which the issue is assigned |
5 | Projects | GitHub also has something called projects, which at first may look synonymous with projects in Jira but are not. A single GitHub repository can have multiple projects, and projects in GitHub are similar to boards in Jira. The confusion arises from the naming. |
6 | Milestones | Similar to sprints in Jira, with their own start and end dates |
7 | Labels | Similar to labels in Jira (multi-select control) |
8 | Users | Accounts to which issues can be assigned |
Current State of the GitHub Account
Sl No. | Entity | Count | Details |
---|---|---|---|
1 | Repository | 1 | Name - SiftScience/code |
2 | Issues | 16,801 | 2,861 open, 13,940 closed |
3 | Projects | 8 | Set up Digicert access for sre@, |
4 | Milestones | 174 | 51 open, 123 closed |
5 | Labels | 116 | |
6 | Users | 31 | Users with “sift” in their username |
Prerequisites for migration
The following are the prerequisites for the migration:
User account with full admin access to the GitHub APIs and data download
User account with full admin access for pushing data to Jira
Separate system on which we can log in and work (this migration involves client data, so we have to use a cloud instance rather than our local machines)
Confirmation of repository/repositories to be migrated
Confirmation of the users to be migrated (the client has to confirm whether the user accounts should be created in Jira)
Confirmation of the data to be migrated as a whole (all entities, e.g. repository, issues, labels, milestones, etc.)
Approaches for migration
Pulling data from GitHub - There are three ways to pull the data out of GitHub
First Way - Using the export option in the UI
Way to pull data | Using the export option in the UI |
Details | The account settings page has an option to export GitHub data. The data is exported in a compressed format. In addition to the issues, pull requests, comments, reviews, releases, projects, events, attachments, milestones, settings and more for each of your repositories, the archive contains your profile data, plan, and any email addresses connected with your account, along with basic information about the users who have interacted with the repositories. The export should be run from an account with admin privileges so that all the data is included in the archive. |
Pros | It is easier to execute and requires less effort. |
Cons | The exported JSON data contains no IDs (unique identifiers), so we may not be able to cross-verify against GitHub after the migration ends or when a few records fail. |
Second Way - Using the API to make the request and fetch the data
Way to pull data | Using the developer API |
Details | GitHub data can be viewed as the following hierarchy: the organization is at the top, followed by repositories. Each repository has issues, comments and attachments. Labels can cut across issues belonging to different repositories. Milestones belong to a repository, and there can also be organization-wide projects shared by multiple repositories. |
Pros | We can generate a report and cross-verify the migrated data manually. We can hand this report, with the IDs of each entity, to the client so that they can cross-verify at their end too. |
Cons | Extra effort is required to implement a bridge that pulls data from GitHub and generates a CSV file that Jira can understand. |
APIs required for migration |
|
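The API approach can be sketched as follows. This is a minimal sketch rather than the final utility: it assumes the GitHub REST endpoint `GET /repos/{owner}/{repo}/issues` with `state=all`, and it follows the `rel="next"` URL in the `Link` response header to page through the results. The function names (`next_page_url`, `http_get`, `iter_issues`) and the `fetch` parameter are our own, introduced only for illustration. The issues endpoint also returns pull requests (marked with a `pull_request` key), so the sketch skips them by default until the pull request question is clarified with the client.

```python
import json
import re
import urllib.request

API_ROOT = "https://api.github.com"


def next_page_url(link_header):
    """Return the rel="next" URL from a Link header, or None on the last page."""
    if not link_header:
        return None
    for match in re.finditer(r'<([^>]+)>\s*;\s*rel="([^"]+)"', link_header):
        if match.group(2) == "next":
            return match.group(1)
    return None


def http_get(url, token):
    """Fetch one page; returns (parsed JSON body, Link header or None)."""
    request = urllib.request.Request(url, headers={
        "Accept": "application/vnd.github+json",
        "Authorization": f"Bearer {token}",
    })
    with urllib.request.urlopen(request) as response:
        return json.load(response), response.headers.get("Link")


def iter_issues(owner, repo, token, fetch=http_get, include_pull_requests=False):
    """Yield every issue (open and closed) of a repository, page by page.

    `fetch` is injectable so the paging logic can be tested without network access.
    """
    url = f"{API_ROOT}/repos/{owner}/{repo}/issues?state=all&per_page=100"
    while url:
        page, link_header = fetch(url, token)
        for item in page:
            # Pull requests appear in this listing too; skip them unless requested.
            if include_pull_requests or "pull_request" not in item:
                yield item
        url = next_page_url(link_header)
```

With roughly 16,800 issues and 100 issues per page, a full pull needs on the order of 170 requests for the issue list alone; comments have to be fetched separately per issue, so rate limits should be taken into account.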
Third Way (Hybrid Approach) - Using the manual export to fetch the attachment details and the API to fetch everything else
Way to pull data | Using Organization data export and API |
Details | We can combine the best of the two approaches above. In this approach we export the organization data using the manual export (or the organization migration API) to get hold of the attachments attached to issues (attachments cannot be fetched through the APIs), and then fetch the remaining organization information, such as repositories, issues, comments, labels, milestones, releases and projects, using the relevant GitHub APIs. |
Pros | We can download all the data for an organization using this approach. |
Cons | The export option in the GitHub UI for a user account cannot export organization data. It only allows us to download the data that belongs to that account, not to any organization. |
High level steps for pulling the data |
|
Process of Migration
Download the data from GitHub (using any of the three methods above)
Transform the data by processing it
Write the transformed data to a CSV file in a format understood by Jira
Upload the CSV data to a Jira Server instance and check whether all the data was imported properly (we run this step first so that we do not mess up the Cloud instance by importing directly into it; a Server instance is easier to fix because we have more control over it)
If the CSV data was imported properly into Jira Server, then upload the same data to Jira Cloud
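The transform and CSV steps above could look roughly like the sketch below. It is written under several assumptions: the input dictionaries follow the shape of GitHub issue JSON (`number`, `title`, `body`, `state`, `assignee.login`, `labels[].name`), the `comment_bodies` key is a hypothetical field that an earlier step would have filled with the comment texts of each issue, and the column names are placeholders that would be mapped to Jira fields during the import. Multi-valued fields are written as repeated columns with the same header, and spaces in label names are replaced with underscores because Jira labels cannot contain spaces.

```python
import csv
import io

# Placeholder column names; the real names come from the field mapping document.
FIXED_COLUMNS = ["GitHub Number", "Summary", "Description", "Status", "Assignee"]


def pad(values, width):
    """Right-pad a list with empty strings so every row has the same width."""
    return list(values) + [""] * (width - len(values))


def issues_to_rows(issues):
    """Convert GitHub-style issue dicts into a header row plus one row per issue."""
    label_width = max((len(i.get("labels") or []) for i in issues), default=0)
    comment_width = max((len(i.get("comment_bodies") or []) for i in issues), default=0)
    # Repeat the column name once per value for multi-valued fields.
    header = FIXED_COLUMNS + ["Labels"] * label_width + ["Comment"] * comment_width
    rows = [header]
    for issue in issues:
        labels = [label["name"].replace(" ", "_") for label in issue.get("labels") or []]
        assignee = (issue.get("assignee") or {}).get("login", "")
        fixed = [issue["number"], issue["title"], issue.get("body") or "",
                 issue["state"], assignee]
        comments = issue.get("comment_bodies") or []
        rows.append(fixed + pad(labels, label_width) + pad(comments, comment_width))
    return rows


def write_csv(issues, stream):
    """Write the converted rows to any text stream (a file or io.StringIO)."""
    csv.writer(stream).writerows(issues_to_rows(issues))
```

Keeping the GitHub issue number as its own column is a deliberate choice in this sketch: it gives the later verification step a key to match Jira issues back to their GitHub source.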
Importing Data into Jira
Data can be imported into Jira Cloud in CSV format. The following links describe how to prepare the CSV for both issue data and attachments.
Jira server - https://confluence.atlassian.com/adminjiraserver087/importing-data-from-csv-998872306.html
Jira Cloud - https://confluence.atlassian.com/adminjiracloud/importing-data-from-csv-776636762.html
Note: To import attachments, the attachment files must be available over HTTP/HTTPS so that Jira Server/Cloud can access them directly.
Task Breakdown and Estimation
Sl No. | Task | Estimation (man-hours) | Comment |
---|---|---|---|
1 | Get admin credentials for GitHub from the client | NA | Action required from client |
2 | Get a separate VM for development work | NA | Action required from client |
3 | There should be a mapping document showing which GitHub field maps to which Jira field | | |
4 | Installations on the development environment and testing of the GitHub account credentials | | |
5 | There should be a way for the utility to pull data from GitHub | | |
6 | There should be a way for the utility to transform the data from the GitHub format to a format Jira understands | | |
7 | There should be a way for the utility to export the transformed data to a CSV format Jira understands | | |
8 | There should be a way for the utility to generate a report of the data pulled from GitHub | | |
9 | There should be a way for the utility to generate a report of the data exported to CSV | | |
10 | There should be a way for the utility to check the data in Jira Cloud using API calls and generate a report of the data imported | | |
11 | Testing | | |
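The reporting tasks (8 to 10) come down to comparing what was pulled from GitHub with what ended up in Jira. A minimal sketch of that comparison is shown below. It assumes that the GitHub issue number is stored on each Jira issue during the import (for example in a custom field, which is our assumption and not yet decided) so that the two sides share a key; the function name `reconcile` is hypothetical.

```python
def reconcile(source_numbers, imported_numbers):
    """Compare GitHub issue numbers with the numbers found on imported Jira issues."""
    source = set(source_numbers)
    imported = set(imported_numbers)
    return {
        "source_count": len(source),
        "imported_count": len(source & imported),
        # Present in GitHub but not found in Jira: failed or skipped imports.
        "missing": sorted(source - imported),
        # Found in Jira but not in the GitHub pull: worth a manual check.
        "unexpected": sorted(imported - source),
    }
```

The same function can be reused for the CSV report by passing the issue numbers read back from the generated CSV file instead of the numbers read from Jira.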
Doubts and Clarifications
Question | Details | Clarification |
---|---|---|
Do we need to migrate pull request details? | GitHub repositories also have pull requests raised by users. If they have to be migrated, they would have to go to Bitbucket Cloud. | |
Do we need to migrate the closed issues? | The repository contains closed issues that have already been worked on and resolved. Do we have to migrate them as well? | |
Do we have to create epics in Jira Cloud for issues marked with the label 'epic'? | Some issues carry the label 'Epic'. Do we have to create an epic in Jira Cloud for each of them? | |
Do we have to migrate users also? | There are users in the SiftScience account. Do we have to migrate these users to Jira Cloud too? | |
What kind of ticket should we create in Jira for each issue in GitHub? | GitHub has only one issue type (issue), whereas Jira has a separate issue type field. How should we identify whether an issue is a bug, task, story or another type? | |
Does the client already use Jira Cloud, or will they start using it after the GitHub migration? | We want to understand whether Jira Cloud currently holds any data/configuration. | |
Do we have a list of labels that we need to support for migration? | | |
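Until the client answers the issue type and epic questions, one possible rule is to derive the Jira issue type from the GitHub labels. The sketch below only illustrates that idea: the label names, the target issue types, their priority order and the default are all hypothetical and would be replaced by whatever the client confirms.

```python
# Hypothetical mapping, checked in order; the first matching label wins.
LABEL_TO_ISSUE_TYPE = [
    ("epic", "Epic"),
    ("bug", "Bug"),
    ("enhancement", "Story"),
]
DEFAULT_ISSUE_TYPE = "Task"  # assumed fallback when no label matches


def issue_type_for(label_names):
    """Pick a Jira issue type from an issue's GitHub label names (case-insensitive)."""
    lowered = {name.lower() for name in label_names}
    for label, issue_type in LABEL_TO_ISSUE_TYPE:
        if label in lowered:
            return issue_type
    return DEFAULT_ISSUE_TYPE
```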
Tasks for POC
Task | Details | Aim | Status |
---|---|---|---|
Migrate: 10 closed and 10 open issues; 5 milestones and 5 projects with attachments | In GitHub, create two to four milestones and add a few issues under each | To check whether we can create closed sprints in Jira | |
Migrate user data | In GitHub, invite some users to join the organization | To check whether GitHub gives us enough information to create a user in Jira | |
Migrate comments | In GitHub, create comments under different issues | To check whether we can migrate the comments under a project and an issue | |
Migrate attachments | In GitHub, create an attachment and migrate it to Jira | To check whether the attachment is migrated to the correct issue | |
Check the release version | Try to migrate releases from GitHub to Jira | To see whether release data can be migrated | |