Open Questions on Implementing Data Mesh & Open Questions on User Roles, Data Access, Data Mgmt (types of data - geospatial, etc) #312

Open
HeatherAck opened this issue Jun 5, 2023 · 10 comments
Assignees
Labels
documentation Improvements or additions to documentation high priority

Comments

@HeatherAck
Contributor

HeatherAck commented Jun 5, 2023

Solutions Architect Questions for Implementation of OS-Climate's Data Mesh

@HeatherAck
Contributor Author

HeatherAck commented Jun 5, 2023

Open Questions based on Technology - OS-C Solutions Architect

Data Mesh Implementation Questions

The highest need is for an operational instance, so the decision is to use the current Data Commons 1.0 as the stable environment, since it has been tested.
Data Exchange, the first real app to use this with a customer, will migrate to this instance.
Mikhail will create a new cluster so that Data Exchange can be promoted to a higher environment (still to be defined).
Data Exchange will provide use cases for creating a security profile.

For the data mesh we don't want to diverge, but we need a strategy to get there.
The data mesh needs a roadmap that is in context with all the OSC offerings. However, when we migrate to it, it has to be tested. Mikhail will connect with Vincent to establish a roadmap.

Data Exchange

  • [x] What is needed to implement the API to connect Data Exchange to OpenMetadata? (@marius)
  • [x] Mikhail has provided https://das-openmetadata.apps.odh-cl2.apps.os-climate.org/ (this is a Dev version). Eric can move his to OSC; Mikhail will facilitate. The process is GitOps.
  • What is needed for Data Exchange to connect to OECD (SAMEPATH) & RMI data (Data Providers)?

OpenMetadata

  • What version should we upgrade to? Upgrade to the current version as noted in the data mesh pattern: 1.0.3
  • We should follow the data mesh pattern 1.0.3.

Trino

  • What version of Trino should we upgrade to? Upgrade to the current version as noted in the data mesh pattern.
    Long term, we should move to the data mesh pattern.

Data Storage

  • If not federated, where will data be stored?

User / Access Management Questions

Keycloak
Discussion points: What is the identity/authentication provider of record (i.e. not GitHub, which is currently used)? Authorization is split among several technologies: Keycloak, Trino, OpenMetadata, others?
How can we apply role-based (RBAC) and attribute-based (ABAC) permissions across Trino, OpenMetadata, Keycloak, and Phys Risk clients?

  • Should we use LDAP? No, we will use Keycloak.
  • What roles need to be created in Keycloak? Data Consumer and Data Producer to start; access to data is based on data license rules.
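
To make the "two roles plus data license rules" idea concrete, here is a minimal sketch in Python. It is only an illustration under stated assumptions: the role identifiers, the action names, the `"open"` license value, and the `Dataset` and `can_access` names are all hypothetical and are not taken from Keycloak, Trino, or any agreed OS-Climate policy.

```python
from dataclasses import dataclass

# Hypothetical role identifiers for the two starting roles named above.
DATA_CONSUMER = "data-consumer"
DATA_PRODUCER = "data-producer"

# Assumed mapping of role -> permitted actions (illustrative only).
ROLE_ACTIONS = {
    DATA_CONSUMER: {"read"},
    DATA_PRODUCER: {"read", "write"},
}


@dataclass(frozen=True)
class Dataset:
    name: str
    license: str  # "open", or an identifier for a restricted license


def can_access(roles, accepted_licenses, dataset, action):
    """Allow the action only if a role permits it AND the license rule is met."""
    # Role check: at least one of the user's roles must permit the action.
    role_ok = any(action in ROLE_ACTIONS.get(role, set()) for role in roles)
    # License check: open data is always fine; otherwise the user must have
    # accepted (or been granted) the dataset's specific license.
    license_ok = dataset.license == "open" or dataset.license in accepted_licenses
    return role_ok and license_ok
```

For example, `can_access([DATA_CONSUMER], {"restricted-x"}, Dataset("example", "restricted-x"), "read")` combines both checks in one call. Where such a check would actually live (Keycloak, Trino, OpenMetadata, or the client apps) is exactly the open question in this thread.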

Namespace

  • Need specifics on how to define Namespace

DevOps

  • Should we use GitLab? No, we will use GitHub.

Data Access

  • What is the process for users to get access to data?
  • Application/user onboarding mechanism

Permissions & Roles

  • How do roles and permissions federate to the other components? Start from the use cases from Data Exchange and Phys Risk, then augment them going forward (e.g., where do PhysRisk users get their API key?). Marius to get back to Nick and Eric B. on mapping that onto the design.
  • Configuration: GitHub identity provider, teams/organization. Who defines these? Marius to provide info.

Security

  • Do we have different security patterns based on data type, roles, etc.? Can those be configured across the various technologies? How do we handle situations where there are X Phys Risk apps communicating securely with 1 OSC Data Exchange?
  • [ ] 13 June 2023: Eric to provide use case to Marius
  • Are we planning to use the same IdP or realm for authentication and authorization between applications and for administration purposes as for external users/applications who consume Data Commons services?
  • Security between Data Exchange, Phys Risk, and Data Commons

@HeatherAck
Contributor Author

@mbogoevici can you please help define the answers to the component questions?

@HeatherAck HeatherAck changed the title Open Questions on Implementing Data Mesh Pattern Open Questions on Implementing User Roles, Data Access, Data Mgmt (types of data - geospatial, etc) as it relates Data Mesh Pattern Jun 12, 2023
@HeatherAck HeatherAck changed the title Open Questions on Implementing User Roles, Data Access, Data Mgmt (types of data - geospatial, etc) as it relates Data Mesh Pattern Open Questions on Implementing Data Mesh & Open Questions on User Roles, Data Access, Data Mgmt (types of data - geospatial, etc) Jun 12, 2023
@HeatherAck
Contributor Author

The ArgoCD plugins are now working, but Fybrik and Pachyderm are showing as degraded. Take a look at the events in ArgoCD. Will also look at the OpenShift console for any issues and inspect the objects; from the CLI, kubectl describe can be used. After fixing the degraded issues, next run the app of apps for Airflow, Jupyter, etc., then set up S3 and SSO. Focus on these steps for the week of 26-June.
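
As a rough aid for the triage step above, the following Python sketch picks out degraded applications from a JSON export of Argo CD applications. It assumes the export is a JSON array of Application objects (for instance from something like `argocd app list -o json`) carrying `metadata.name` and `status.health.status`; the function name `degraded_apps` is hypothetical.

```python
import json


def degraded_apps(apps_json):
    """Return the names of applications whose health status is "Degraded".

    `apps_json` is assumed to be a JSON array of Argo CD Application
    objects, each with metadata.name and status.health.status.
    """
    apps = json.loads(apps_json)
    names = []
    for app in apps:
        # Use .get() so that apps with no reported health are skipped
        # instead of raising KeyError.
        health = app.get("status", {}).get("health", {}).get("status")
        if health == "Degraded":
            names.append(app["metadata"]["name"])
    return names
```

The resulting names can then be inspected one by one in the ArgoCD events view or the OpenShift console.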

@HeatherAck
Contributor Author

Resolved most of the ArgoCD issues. The Pachyderm operator is not working; need to install group. Rebuilding the Elyra images. Still need to set up time with @mbogoevici; @ryanaslett to set up time with him on 31-Jul.

@eb-oss eb-oss removed their assignment Aug 23, 2023
@HeatherAck
Contributor Author

HeatherAck commented Sep 18, 2023

@jpaulrajredhat to update the opendatahub issues today. @redmikhail is resending the URL to @jpaulrajredhat:
https://console-openshift-console.apps.osc-cl4.apps.os-climate.org/

@jpaulrajredhat

@HeatherAck who should be the contact person for the cluster 4 installation? It looks like the tools are only partially installed and some of the components are in a failed state.

@HeatherAck
Contributor Author

@ryanaslett (and @grigarr, his manager) are the primary key contacts.

@jpaulrajredhat

@HeatherAck Could you please set up a meeting with @ryanaslett? I need to understand how far the installation got and which components/tools failed. I can see that, within the tooling installation itself, some of the components failed due to a space issue.

@HeatherAck
Contributor Author

@jpaulrajredhat he is free any day after 11 AM PT. What works best for you?

Projects
Status: In Progress

8 participants