new AMI fails to launch the spark cluster #227

Closed
xiandong79 opened this issue Dec 1, 2017 · 3 comments

Comments

@xiandong79

I would like to run BigDL on a Spark cluster, following https://bigdl-project.github.io/master/#ProgrammingGuide/run-on-ec2/. The AMI is

[screenshot: wx20171201-122522]

The error output is:

Do you want to terminate the 5 instances created by this operation? [Y/n]: n
[107.23.150.80] bash: line 2: python: command not found

My config is:

provider: ec2

services:
  spark:
    version: 2.2.0

  hdfs:
    version: 2.7.3

launch:
  num-slaves: 4

providers:
  ec2:
    key-name: Virginia-us-east-1
    identity-file: /Users/dong/Virginia-us-east-1.pem
    instance-type: m4.large
    region: us-east-1
    ami: ami-8c87099a
    user: ubuntu

xiandong79 changed the title from "new AMI" to "new AMI fails to launch the spark cluster" on Dec 1, 2017
@nchammas
Owner

nchammas commented Dec 1, 2017

python: command not found

Looks like you are using an AMI that does not have Python, which Flintrock requires. To work around this, you can use the --ec2-user-data option to install Python on instance launch. You can read more about this feature in the EC2 user data documentation.

Unless you are comfortable navigating these types of issues, I suggest sticking to the default Amazon Linux AMI. It will work out of the box.
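
A rough sketch of that workaround, assuming a Debian or Ubuntu based AMI (the config above uses `user: ubuntu`) where the interpreter is packaged as `python`. The file name `user-data.sh` and the cluster name `bigdl-cluster` are placeholders, not anything from this thread, and the launch command is left commented out:

```shell
# Write a user-data script that EC2 runs as root on first boot.
# Assumption: apt-get is the package manager and the package is named "python";
# adjust both for other distributions or releases.
cat > user-data.sh <<'EOF'
#!/bin/bash
if ! command -v python >/dev/null 2>&1; then
    apt-get update
    apt-get install -y python
fi
EOF

# Then pass the script to Flintrock at launch (cluster name is a placeholder):
# flintrock launch bigdl-cluster --ec2-user-data user-data.sh
```

Since Flintrock also relies on other tools on the instance, an AMI like this may still hit further missing dependencies once Python is in place.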

@nchammas nchammas closed this as completed Dec 1, 2017
@xiandong79
Author

So why not check whether Python is installed before installing Spark on the instance?

If it is not installed, install it first.

@nchammas
Owner

nchammas commented Dec 1, 2017

That's a good idea, and Flintrock already does this for things that are not commonly available by default, like Java 8. But Python is available by default on almost every major Linux distribution I can think of. Flintrock also depends on yum, Bash, curl, and perhaps a few other commonly available tools.

If someone wants to use a barebones AMI that doesn't include these basic tools, I think that's fine, but it should probably be on them to install the missing tools themselves.

Looking at the documentation for this BigDL AMI though, I see that it's already supposed to include Python. So perhaps Python is indeed available, but under a different name like python3 instead of python. If that's the case, simply creating an alias via the --ec2-user-data script would fix the issue.
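
If the interpreter really is there under another name, the user-data script could link it into place. The sketch below uses a symlink rather than a shell alias, since aliases are normally not expanded in non-interactive shells such as commands sent over SSH. The function name `ensure_command` and the `/usr/local/bin` target directory are assumptions for illustration:

```shell
# ensure_command NAME FALLBACK BIN_DIR
# If NAME is not on PATH but FALLBACK is, create BIN_DIR/NAME as a symlink
# to FALLBACK. Returns non-zero when neither command can be found.
ensure_command() {
    name="$1"; fallback="$2"; bin_dir="$3"
    if command -v "$name" >/dev/null 2>&1; then
        return 0    # already available, nothing to do
    fi
    target="$(command -v "$fallback")" || return 1
    ln -s "$target" "$bin_dir/$name"
}

# In a user-data script (run as root on the instance) the call would be:
# ensure_command python python3 /usr/local/bin
```

Whether software written for `python` then runs correctly under the linked interpreter depends on the AMI and is not something this sketch checks.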

All that said, perhaps Flintrock should check for these remote dependencies explicitly, similar to how it does for local dependencies.
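
One way such a remote check might look, as a sketch rather than anything Flintrock actually implements: a small function that reports which of the expected tools are missing. The function name `missing_tools` is hypothetical, and the tool list is just the one mentioned in this comment:

```shell
# Print the name of every argument that is not available as a command.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

# Check the tools named above; this prints nothing when all four are present.
# missing_tools python yum bash curl
#
# Over SSH the same loop could be sent as a one-liner (key, user, and host are placeholders):
# ssh -i KEY.pem USER@HOST 'for t in python yum bash curl; do command -v "$t" >/dev/null 2>&1 || echo "$t"; done'
```

Reporting every missing tool at once, instead of failing on the first one, would turn an error like `python: command not found` into a single up-front message.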
