This repository has been archived by the owner on Feb 3, 2021. It is now read-only.

Feature: Spark mixed mode support #350

Merged: 10 commits, Feb 7, 2018

Conversation

@jafreck (Member) commented Jan 24, 2018

Fix #294
Fix #307
Fix #275

@jafreck jafreck added this to the v0.5.2 milestone Feb 1, 2018
@jafreck jafreck requested a review from paselem February 6, 2018 21:08
cli/config.py Outdated
"You must configure a VNET to use AZTK in mixed mode (dedicated and low priority nodes). Set the VNET's subnet_id in your cluster.yaml.")

# ensure spark_client is built with AAD if using mixed mode
if not spark_client.secrets_config.service_principal.tenant_id and self.mixed_mode:
Contributor:
Minor: since AAD is a requirement for subnets, I would raise this error first.
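The reordering the reviewer suggests might look like this sketch. The attribute names (`secrets_config.service_principal.tenant_id`, `mixed_mode`, `subnet_id`) follow the diff above; `AztkError` and the function name are assumptions for illustration, not AZTK's actual code:

```python
# Sketch of the suggested validation order: check the AAD prerequisite
# first, then the VNET/subnet requirement. Hypothetical helper, not the
# actual AZTK implementation.
class AztkError(Exception):
    pass

def validate_mixed_mode(spark_client, cluster_conf):
    if not cluster_conf.mixed_mode:
        return
    # AAD credentials are a prerequisite for subnets, so fail on them first
    if not spark_client.secrets_config.service_principal.tenant_id:
        raise AztkError(
            "You must use AAD (service principal) credentials to use AZTK "
            "in mixed mode.")
    # A VNET subnet is also required for mixed mode clusters
    if not cluster_conf.subnet_id:
        raise AztkError(
            "You must configure a VNET to use AZTK in mixed mode "
            "(dedicated and low priority nodes). Set the VNET's subnet_id "
            "in your cluster.yaml.")
```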

@@ -128,6 +125,7 @@ def print_cluster_conf(cluster_conf):
cluster_conf.size + cluster_conf.size_low_pri)
log.info("> dedicated: %s", cluster_conf.size)
log.info("> low priority: %s", cluster_conf.size_low_pri)
log.info("mixed mode: %s", cluster_conf.mixed_mode)
Contributor:
I'm not sure I would expose this. I don't think it really adds any value to the user.

@paselem (Contributor) left a comment:

I didn't see it here, but we should also expose the type of VM when a user does 'aztk spark cluster get --id cluster'. Each dedicated VM should be labelled as such (same with low pri).

@jafreck (Member, Author) commented Feb 7, 2018

It looks like this now; what do you think, @paselem?

Cluster         sst7
------------------------------------------
State:          active
Node Size:      standard_f2
Nodes:          2
| Dedicated:    2
| Low priority: 0

|               Nodes                |        State        |        IP:Port       | Dedicated  |  Master  |
|------------------------------------|---------------------|----------------------|------------|----------|
| tvm-2576887678_1-20180207t010308z  |        idle         |   40.79.57.90:50000  |     *      |          |
| tvm-2576887678_2-20180207t010308z  |        idle         |   40.79.57.90:50001  |     *      |    *     |
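For reference, the Dedicated and Master columns in the output above could be rendered by a helper like this sketch. The function name, signature, and column widths are assumptions for illustration, not AZTK's actual implementation:

```python
def format_node_row(name, state, ip_port, is_dedicated, is_master):
    """Render one row of the `aztk spark cluster get` node table,
    marking dedicated nodes and the master node with a `*`.
    Hypothetical helper; column widths match the sample output above."""
    dedicated = "*" if is_dedicated else " "
    master = "*" if is_master else " "
    return "| {:<34} | {:^19} | {:^20} | {:^10} | {:^8} |".format(
        name, state, ip_port, dedicated, master)
```

A low-priority node would simply pass `is_dedicated=False`, leaving its Dedicated column blank.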

@paselem (Contributor) left a comment:

Looks good

@jafreck jafreck merged commit 748a126 into Azure:master Feb 7, 2018
jafreck added a commit that referenced this pull request Feb 23, 2018
* Feature: on node user creation (#303)

* client side on node user creation

* start create user on node implementation

* fix on node user creation

* remove debug statements

* remove commented code

* line too long

* fix spinner password prompt ui bug

* set wait to false by default, formatting

* encrypt password on client, decrypt on node

* update docs, log warning if password used

* Fix list-apps crash (#364)

* Allow submitting jobs into a VNET (#365)

* Add subnet_id to job submission cluster config

* add some docs

* Feature: Spark mixed mode support (#350)

* add support for aad creds for storage on node

* add mixed mode support

* add docs

* switch error order

* add dedicated to get_cluster

* remove mixed mode in print_cluster_conf

* Feature: spark init docker repo customization (#358)

* customize docker_repo based on init args

* whitespace

* add some docs

* r-base to r

* case insensitive r flag, typo fix

* Bug: Load default Jars for job submission CLI (#367)

* load jars in .aztk/ by default

* rewrite loading config files

* Feature: Cluster Run and Copy (#304)

* start implementation of cluster run

* fix cluster_run

* start debug sequential user add and delete

* parallelize user creation and deletion, start implementation of cluster scp

* continue cluster_scp implementation

* debug statements, disconnect error: permission denied

* untested paramiko implementation of cluster_run

* continue debugging user creation bug

* fix bug with pool user creation, start concurrent implementation

* start fix of paramiko cluster_run and cluster_copy

* working paramiko cluster_run implementation, start cluster_scp

* fix cluster_scp command

* update requirements, rename cluster_run function

* remove unused shell functions

* parallelize run and scp, add container_name, create logs wrapper

* change scp to copy, clean up

* sort imports

* remove asyncssh from node requirements

* remove old import

* remove bad error handling

* make cluster user management methods private

* remove comment

* remove accidental commit

* fix merge, move delete to finally clause

* add docs

* formatting

* Feature: Refactor cluster config to use ClusterConfiguration model (#343)

* Bug: fix core-site.xml typo (#378)

* fix typo

* crlf->lf

* Bug: fix regex for is_gpu_enabled (#380)

* fix regex for is_gpu_enabled

* crlf->lf

* Bug: spark SDK example fix (#383)

* start fix sdk

* fix sdk example

* crlf->lf

* Fix: Custom scripts not read from cluster.yaml (#388)

* Feature: spark shuffle service (#374)

* start shuffle service by default

* whitespace, delete misplaced file

* crlf->lf

* crlf->lf

* move spark scratch space off os drive

* Feature: enable dynamic allocation by default (#386)

* Bug: stop using mutable default parameters (#392)

* Bug: always upload spark job logs errors (#395)

* Bug: spark submit upload error log type error (#397)

* Bug: Spark Job list apps exit code 0 (#396)

* Bug: fix spark-submit cores args (#399)

* Fix: Trying to add user before master is ready show better error (#402)

* Bug: move spark.local.dir to location usable by rstudioserver (#407)

* Feature: SDK support for file-like configuration objects (#373)

* add support for file-like objects for configuration files

* fix custom scripts

* remove os.pathlike

* merge error

* Feature: Basic Cluster and Job Submission SDK Tests (#344)

* add initial cluster tests

* add cluster tests, add simple job submission test scenario

* sort imports

* fix job tests

* fix job tests

* remove pytest from travis build

* cluster per test, parallel pytest plugin

* delete cluster after tests, wait until deleted

* fix bugs

* catch right error, change cluster_id to base_cluster_id

* fix test name

* fixes

* move tests to integration_tests dir

* update travis to run non-integration tests

* directory structure, decoupled job tests

* fix job tests, issue with submit_job

* fix bug

* add test docs

* add cluster and job delete to finally clause

* Feature: Spark add worker on master option (#415)

* Add worker_on_master to ClusterConfiguration

* add worker_on_master to JobConfiguration

* Feature: task affinity to master node (#413)

* Release: v0.6.0 (#416)

* update changelog and version

* underscores to stars