10. Requirement Analysis/Scoping

Introduction

As some of the clients are moving from Github to Jira, the relevant data they hold in Github needs to be migrated. This data may include many entities that have to be migrated from the source system (Github, in our case) to the target system (Jira Cloud). The two systems belong to different ecosystems and offer little out-of-the-box support for migration between them.

The aim of this document is to identify the Github entities that need to be migrated to Jira Cloud, decide on the steps of the migration, and produce a report of the entities that were imported successfully (or that failed).

Note: Throughout the document Jira refers to Jira Cloud

Requirements

We have to pull the data out of Github and push it to Jira. We have identified the Github entities that need to be migrated to Jira. They are as follows:

Sl No#

Github Entity

Description

1

Issues

Issues can be bugs, enhancements, change requests or any other requests related to the repository

2

Comments

A thread of discussion on an issue. Each issue can have multiple comments, and each comment can have multiple attachments.

3

Attachments

Attachments can be attached to comments only; the issue description is treated as the first comment on the issue

4

Assignee

User account to which the issue is assigned

5

Projects

Github also has something called projects, which at first may look synonymous with projects in Jira, but they are not. A single Github repository can have multiple projects, and projects in Github are similar to boards in Jira. The confusion arises from the naming.

6

Milestones

These are similar to sprints in Jira with their own start and end date

7

Labels

These are similar to labels in Jira (a multi-select control)

8

Users

Users are the accounts to which issues can be assigned

 

Current State of the Github Account

Sl No#

Entity

Count

Details

1

Repository

1

Name - SiftScience/code

2

Issues

16,801

2,861 (open issues), 13,940 (closed issues)

3

Projects

8

Set up Digicert access for sre@,
GCS costs Q2 2020,
bigtable backup 2020,
Migrate expr from AWS to GCP,
Prod model GCP migration (groundwork),
Diagnose Mongo Issues Jan 2020,
Mongo Upgrade,
Ubuntu 18

4

Milestones

174

51 (open), 123 (closed)

5

Labels

116

 

6

Users

31

Users with “sift” in their username
(these are the unique assignees with “sift” in their username)

Prerequisites for migration

The following are the prerequisites for migration:

  • User account with full admin access for APIs and data download from Github

  • User account with full admin access for pushing data to Jira

  • Separate system on which we can log in and work (this migration involves client data, so we have to use a cloud instance rather than our local machines)

  • Confirmation of the repository/repositories to be migrated

  • Confirmation of the users to be migrated (the client has to confirm whether the user accounts should be created in Jira or not)

  • Confirmation of the data to be migrated as a whole (all the entities, e.g. repository, issues, labels, milestones, …)

Approaches for migration

Pulling data from Github - There are three ways to pull the data out of Github

  • First Way - Using the export option in the UI

Way to pull data

Using the export option in the UI

Details

The account settings page has an option to export the Github data. The data is exported as a compressed archive. The archive contains your profile data, plan, and any email addresses connected with your account, in addition to the issues, pull requests, comments, reviews, releases, projects, events, attachments, milestones, settings and more for each of your repositories, along with basic information about the users who have interacted with them. The export should be run from an account with admin privileges so that all the data is included in the archive.

Pros

It is easier to execute and requires less effort.
Contains data in JSON format which can be easily processed.

Cons

The exported JSON data contains no ids (unique identifiers), so we may not be able to cross-verify against Github after the migration ends or when a few items fail.
Cross-verifying the migration may become tedious, as we may have to sift through many issues manually. This can prove to be a hurdle when generating the report.

  • Second Way - Using the API to make the request and fetch the data

Way to pull data

Using developer API

Details

Github data can be viewed as the following hierarchy: the organization is at the top, followed by repositories. Each repository has issues, comments and attachments. Labels are defined per repository, though the same label names can be reused across repositories. Milestones belong to a repository, and there can also be organization-wide projects shared by multiple repositories.

Pros

We can generate a report and cross-verify the migrated data manually. We can hand this report, with the ids of each entity, to the client so that they can cross-verify at their end too.
We have more control over what information we pull from Github.
We have access to the unique ids of each entity in Github, so if something fails we can identify exactly what failed.

Cons

Extra effort is required to implement a bridge that pulls data from Github and generates a CSV file that Jira can understand
It is not possible to get the attachments using the API

APIs required for migration

GET /orgs/:org - get the details of an organization
GET /repos/:owner/:repo - get a repository by owner and repository name
GET /repos/:owner/:repo/issues - get the issues of a repository
Most of these APIs include links in their responses to the related objects we need
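
The issue-listing call above can be sketched roughly as follows. This is a minimal sketch under stated assumptions, not the utility's actual implementation: the names `next_page_url`, `http_get` and `fetch_issues` are illustrative, and the token is assumed to be a personal access token with access to the repository. It asks for `state=all` because the endpoint returns only open issues by default, follows the `Link` response header to walk the pages, and skips pull requests, which the issues endpoint also returns (marked by a `pull_request` key).

```python
import json
import re
import urllib.request

API_ROOT = "https://api.github.com"


def next_page_url(link_header):
    """Return the rel="next" URL from a Link header, or None on the last page."""
    for url, rel in re.findall(r'<([^>]+)>;\s*rel="([^"]+)"', link_header or ""):
        if rel == "next":
            return url
    return None


def http_get(url, token):
    """Fetch one page; returns (parsed JSON body, Link header or None)."""
    request = urllib.request.Request(url, headers={
        "Accept": "application/vnd.github+json",
        "Authorization": f"Bearer {token}",
    })
    with urllib.request.urlopen(request) as response:
        return json.load(response), response.headers.get("Link")


def fetch_issues(owner, repo, token, get=http_get):
    """Collect every issue (open and closed) of a repository, page by page.

    `get` is injectable so the paging logic can be exercised without network access.
    """
    url = f"{API_ROOT}/repos/{owner}/{repo}/issues?state=all&per_page=100"
    issues = []
    while url:
        page, link_header = get(url, token)
        # The issues endpoint also lists pull requests; keep real issues only.
        issues.extend(item for item in page if "pull_request" not in item)
        url = next_page_url(link_header)
    return issues
```

Each returned issue carries its own `number` and `id`, which is what makes the cross-verification report possible with this approach.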

  • Third Way (Hybrid Approach) - Using the manual export to fetch the attachment details and the API to fetch all other details

Way to pull data

Using Organization data export and API

Details

We can combine the best of the two approaches above. In this approach we export the organization data using the manual export (or the organization migration API) to get hold of the attachments on issues (we cannot get attachments using the APIs), and then fetch the other information about the organization, such as repositories, issues, comments, labels, milestones, releases and projects, using the relevant Github APIs.

Pros

We can download all the data for an organization using this approach

Cons

The export option in the Github UI for a user account cannot export organization data. It only allows us to download the data that belongs to that account, not to any organization.
We may have to use Github’s asynchronous organization migration API to get around this problem

High level steps for pulling the data

  • Export the organization data using either the UI or the organization migration API (for attachments)
    POST /orgs/:org/migrations - to start generating the organization data archive
    GET /orgs/:org/migrations/:migrationId - to check the status of the migration
    GET /orgs/:org/migrations/:migrationId/archive - to download the compressed file with organization data and attachments

  • Use the Github API to pull the data for the other entities (repositories, issues, comments, projects, milestones, labels)
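
Because the organization export is asynchronous, the status call has to be polled before the archive can be downloaded. The loop below is a minimal sketch of that wait, assuming the status response carries a `state` field with the values `pending`, `exporting`, `exported` or `failed`. The function name, the polling interval and the retry limit are illustrative choices, and `get_state` stands in for whatever code performs the status request.

```python
import time

# States in which the export is still running.
IN_PROGRESS_STATES = {"pending", "exporting"}


def wait_for_export(get_state, poll_seconds=30, max_polls=120, sleep=time.sleep):
    """Poll until the migration reaches "exported"; raise on failure or timeout.

    get_state: callable returning the current value of the "state" field
               from GET /orgs/:org/migrations/:migrationId.
    """
    for _ in range(max_polls):
        state = get_state()
        if state == "exported":
            return state
        if state not in IN_PROGRESS_STATES:
            raise RuntimeError(f"migration ended in state {state!r}")
        sleep(poll_seconds)
    raise TimeoutError("migration archive was not ready in time")
```

Once the function returns, the archive endpoint can be called to download the compressed file.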

Process of Migration

  • Download the data from Github (using any of the three methods above)

  • Transform the data by processing it

  • Write the transformed data to a CSV file in a format understood by Jira

  • Upload the CSV data to a Jira server instance and check whether all the data was imported properly (we run this step first so that we do not mess up the cloud instance by importing directly into it; a server instance is easier to fix because we have more control over it)

  • If the CSV data was imported properly into Jira server, then upload the same data to Jira Cloud
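
The transform and CSV-writing steps can be sketched as below. This is a hedged illustration, not the final field mapping: the column names (`Github Id`, `Summary` and so on) and the `user_map` argument are assumptions, since the actual mapping is agreed with the client and chosen in the Jira import wizard. The input dictionaries follow the shape of the Github issues API response (`number`, `title`, `body`, `state`, `assignee.login`, `labels[].name`). Multi-valued fields such as labels are written by repeating the column name once per value, which is how the Jira CSV importer expects them; spaces in label names are replaced because Jira labels cannot contain spaces, and the underscore substitution is itself an assumption.

```python
import csv


def issues_to_csv(issues, user_map, out):
    """Write Github issues as CSV rows for the Jira importer.

    issues:   list of dicts shaped like the Github issues API response
    user_map: Github login -> Jira username (hypothetical mapping from the client)
    out:      writable text file, opened with newline="" when it is a real file
    """
    # One "Labels" column per label of the most-labelled issue.
    max_labels = max((len(issue["labels"]) for issue in issues), default=0)
    writer = csv.writer(out)
    writer.writerow(
        ["Github Id", "Summary", "Description", "Status", "Assignee"]
        + ["Labels"] * max_labels
    )
    for issue in issues:
        labels = [label["name"].replace(" ", "_") for label in issue["labels"]]
        labels += [""] * (max_labels - len(labels))  # pad to a fixed row width
        assignee = issue.get("assignee") or {}
        writer.writerow([
            issue["number"],
            issue["title"],
            issue.get("body") or "",
            issue["state"],
            user_map.get(assignee.get("login"), ""),
            *labels,
        ])
```

Keeping the Github issue number in its own column means the same file can later be used to match Jira issues back to their source.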

 

 

Importing Data in JIRA

Data can be imported into Jira Cloud in CSV format. The following links describe how to prepare the CSV file for both issue data and attachments.
Jira server - https://confluence.atlassian.com/adminjiraserver087/importing-data-from-csv-998872306.html
Jira Cloud - https://confluence.atlassian.com/adminjiracloud/importing-data-from-csv-776636762.html

Note: To import attachments, the attachment files must be available over HTTP/HTTPS so that Jira Server/Cloud can access them directly

High Level Tasks

Sl No

Task

Comment

1

Get admin credentials for Github and Jira access from the client

Action required from Client

Admin access is required to export the attachment data

2

Get separate VM for development work

Action required from Client

3

There should be a mapping document showing which field in Github maps to which field in Jira

Empyra team to create the mapping and get confirmation from the customer

4

Installations on development environment & testing of Github account credentials

Yes we have

5

There should be a way for the utility to pull data from Github

Yes we have

6

There should be a way for the utility to transform the data from the Github format to a format that Jira understands

Analysis is done. Transformation to be worked upon.

7

There should be a way for the utility to export the transformed data to a CSV format that Jira understands

Action from Empyra team

8

There should be a way for the utility to generate a report of the data pulled from Github

Action from Empyra team

9

There should be a way for the utility to generate a report of the data exported to CSV format

Action from Empyra team

10

There should be a way for the utility to check the data in Jira Cloud using API calls and generate a report of the data exported

Action from Empyra team

11

Testing

Action from Empyra team
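
The reports in tasks 8 to 10 come down to comparing what was pulled from Github with what is found in Jira afterwards. A minimal sketch of that comparison is shown below. It assumes the Github issue number was stored on each Jira issue during import (for example in a custom field, which is an assumption here) and that both lists have already been collected; the function name and report keys are illustrative.

```python
def reconcile(github_numbers, jira_github_ids):
    """Compare Github issue numbers with the ones found on imported Jira issues."""
    source = set(github_numbers)
    imported = set(jira_github_ids)
    return {
        "migrated": sorted(source & imported),    # present on both sides
        "missing": sorted(source - imported),     # pulled from Github, not in Jira
        "unexpected": sorted(imported - source),  # in Jira, not in the Github pull
    }
```

The `missing` list is the one to hand back for re-import; an empty list means every pulled issue was found in Jira.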

 

Doubts and Clarifications

Question

Details

Clarification

Do we need to migrate pull request details?

Github repositories also have pull requests raised by users. If we have to migrate them, they would have to be migrated to Bitbucket Cloud

No, there is no need to migrate repositories

Do we need to migrate the closed issues?

The repository contains some closed issues that somebody has already worked on and resolved. Do we have to migrate them as well?

Yes

Do we have to create epics in Jira cloud for issues marked with label ‘epic’?

Some issues carry the label 'Epic'. Do we have to create an epic in Jira Cloud for such issues?

Yes

Do we have to migrate users also?

There are users in the SiftScience account. Do we have to migrate these users to Jira Cloud too?

Mapping between Github users and Jira users is required

What kind of ticket should we create in Jira for each issue in Github?

Github has only one issue type, called issues, whereas Jira has a separate issue type field. How should we identify which issue is a bug, task, story or any other type?

Open

Story

What if an issue doesn’t have any issue-type label?

Does the client already use Jira Cloud, or will they start using it after the Github migration?

We want to understand whether Jira Cloud currently holds any data/configuration.

Client is already using Jira Cloud

Do we have a list of labels that we need to support for migration?

The labels in Github should be migrated to Jira. Sift uses labels for different purposes: some mark issue types such as bugs and feature requests, while others mark the specific technology involved.

Yes we have the list

What if an issue in Github is marked with two labels that are considered different projects?

Example: if an issue AP-1 has two labels, project 1 and project 2, in Github, then under which Jira project should the issue be placed when it is moved: project 1 or project 2?

Generate a report of such issues having more than one project label
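
Two of the clarifications above lend themselves to a short sketch: deriving a Jira issue type from Github labels, and reporting issues that carry more than one project label. The code below is illustrative only. The `epic` mapping follows the answer above; the `bug` entry and the `Story` default are assumptions (the issue-type question is still marked Open), and `project_labels` stands for the list of labels that the client treats as projects.

```python
# Label name (lower-cased) -> Jira issue type. Only "epic" is confirmed above;
# "bug" is a hypothetical entry standing in for the agreed label list.
ISSUE_TYPE_BY_LABEL = {"epic": "Epic", "bug": "Bug"}
DEFAULT_ISSUE_TYPE = "Story"  # assumed fallback when no issue-type label is present


def issue_type_for(label_names):
    """Return the Jira issue type for the first label that has a mapping."""
    for name in label_names:
        issue_type = ISSUE_TYPE_BY_LABEL.get(name.lower())
        if issue_type:
            return issue_type
    return DEFAULT_ISSUE_TYPE


def issues_with_multiple_projects(issues, project_labels):
    """List (issue number, project labels) for issues tagged with 2+ projects."""
    report = []
    for issue in issues:
        names = [label["name"] for label in issue["labels"]]
        matches = sorted(name for name in names if name in project_labels)
        if len(matches) > 1:
            report.append((issue["number"], matches))
    return report
```

Issues that appear in the second report need a decision from the client before they can be assigned to a Jira project.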

Dependency from the Client

Item

Details

Status

Project List

Client has to create three more projects

Done

Labels

Done

Done

Issue Types

Done

Done

Field Mapping from JIRA to Github

 

Done

Milestone

Done (we have only one sprint in Jira, and we will have to migrate all the issues under this sprint)

No need to migrate milestones

User List

We need usernames of the users from JIRA

Done

Access to JIRA

 

Done

Tasks for POC

Task

Details

Aim

Status

Migrate: 10 closed and 10 open issues

Migrate: 5 milestones and 5 projects with attachments

In Github, create four milestones and add a few issues under each
Now close two of the milestones and keep the other two open.
Also tag the issues in these milestones with projects
Try to migrate the data

To check if we can create closed sprints in Jira or not

 

Migrate User data

In Github, invite some users to join the organization
Migrate these users to Jira

To check if we have enough information to create a user in Jira from Github

 

Migrate Comments

In Github, create comments under different issues.
Migrate these comments under issues in Jira

To check whether we can migrate the comments under the right project and issue

 

Migrate Attachments

In Github, create an attachment and migrate it to Jira

To check whether we can migrate the attachment to the proper issue

 

Check the release version

Try to migrate releases from Github to Jira

To see if we can migrate releases data or not

 

 

Work Breakdown (Stories & Estimations)

Each story is listed with its tasks and the status of each task.

Story 1 - Utility should be able to pull data from Github

  • Utility should be able to pull api data from Github - Done

  • Utility should be able to download the migration data - Done

  • Utility should be able to save the data locally to the hard drive - Done

  • Utility should be able to join the organization migration(attachment) data to data pulled from api - Done

  • Utility should be able to generate the report based on data pulled from Github - Done

Story 2 - Utility should be able to transform the data in JIRA compatible format

  • Utility should be able to read and transform the data into CSV format compatible with JIRA - Done

  • Utility should be able to generate the report of the transformed data - Done

Story 3 - Configure the test JIRA server instance to emulate the cloud

  • Create users in the jira instance - Done

  • Create all projects with same name as labels in Github and custom fields - Done

  • Person should be able to import the data in Jira server - Done

  • Utility should be able to generate a report to validate the data in JIRA server against the data imported in CSV format - Done

  • Validate the migrated data - Done

Story 4 - Import the transformed data into Jira cloud

  • Person should be able to import the data in JIRA cloud - Done

  • Validate the migrated data - Done