Running your R code on Azure with mrsdeploy

Tags: Data analysis, Azure, R language
Posted by admin on 2017-03-28 16:34:41

Let’s say you’ve built a model in R that is larger than you can conveniently run locally, and you want to take advantage of Azure’s resources simply to run it on a larger machine. This blog explains how to provision and run an Azure virtual machine (VM) for this, using the mrsdeploy library that comes installed with Microsoft’s R Server. We will work specifically with the Ubuntu Linux version of the VM, so you’ll need to be familiar with working with superuser privileges at the command line in Linux, and of course, familiar with R.

The fundamental architecture consists of your local machine acting as the client, and a server machine that you create for it in the Cloud. You’ll set up a service on the remote machine, the one in the cloud. Once you do this, you needn’t interact directly with the remote machine; instead you issue commands to it and see the results returned at the client. This is one approach; there are many ways this can be done in Azure, depending on your choice of language, reliance on coding, capabilities of the service, and complexity and scale of the task. A data scientist typically works interactively first, exploring data on an individual machine, and then puts the resulting model into production at scale, in this example in the Cloud. The purpose of this posting is to clarify the deployment process, or as it is called, in a mouthful, operationalization. In short, using a VM running the mrsdeploy library in R Server lets you operationalize your code with little effort, at modest expense.

Alternatively, instead of setting up a service with R Server, one could, unadvisedly, just provision a bare virtual machine and log in to it as one would any remote machine, with the manual encumbrance of having to work with multiple machines, load application software, and move data and code back and forth. But that’s what we avoid. The point of the Cloud is to make working with large data and compute as much as possible like working on your local computer.




Deploying Microsoft R Server (MRS) on an Azure VM

Azure Marketplace offers a Linux VM (Ubuntu version 16.04) preconfigured with R Server 2016. Additionally the Linux VM with R Server comes with mrsdeploy, a new R package for establishing a remote session in a console application and for publishing and managing a web service written in R. In order to use the R Server’s deployment and operationalization features, one needs to configure R Server for operationalization after installation, to act as a deployment server and host analytic web services.

Alternatively, the Marketplace offers other Azure platforms for operationalization using R Server, with other operating systems and platforms including HDInsight, Microsoft’s Hadoop offering. Or, equivalently, one could use the Data Science VM available in the Marketplace, since it has a copy of R Server installed. Configuration of these platforms is similar to the example covered in this posting.

Provisioning an R Server VM, as referenced in the documentation, takes a few steps that are detailed here: configuring the VM, and setting up the server account to authorize remote access. To set up the server you’ll use the system account you set up as a user of the Linux machine. The server account is used for client interaction with the R Server, and should not be confused with the Linux system account. This is a major difference from the Windows version of the R Server VM, which uses Active Directory services for authentication.



Provisioning a machine from the Marketplace

You will want to install an Ubuntu Marketplace VM with R Server preinstalled. The best way to find it on portal.azure.com is to search for “r server”:

[Figure: R Server in the Marketplace]

Select the Ubuntu version. Do a conventional deployment; let’s say you name yours mymrs. Take note of the mymrs-ip public address and the mymrs-nsg network security group resources created for it, since you will want to customize them.

Log in to the VM using the system account you set up in the Portal, and add these aliases: one for the path to the R executable, MRS (aka Revo64), and one for the mrsdeploy menu-driven administration tool.

alias rserver='/usr/local/bin/Revo64-9.0'
alias radmin='sudo /usr/local/bin/dotnet \
/usr/lib64/microsoft-deployr/9.0.1/Microsoft.DeployR.Utils.AdminUtil/Microsoft.DeployR.Utils.AdminUtil.dll'

The following are a set of steps to bring up on the VM a combined web-compute server (a “one-box” server) that can be accessed remotely.



1. Check that you can run Microsoft R Server (MRS)

Just use the alias for MRS:

$ rserver
[Note a line in the banner saying
"Loading Microsoft R Server packages, ..."]

Here is a simple test that the MRS libraries are loaded and working. Note that the MRS libraries (the “rx” functions) are preloaded.

> rxSummary(formula = ~., data = iris)

2. Set up the MRS server for mrsdeploy

mrsdeploy operationalization runs two services, the web node and one or more compute nodes. In the simplest configuration, the one described here, both “nodes” are services running on the same VM. Alternatively, by putting these on separate machines, larger loads can be handled with one web node and one or more compute nodes.

Use the alias you created for the admin tool.

$ radmin

This utility brings up a menu:

*************************************
Administration Utility (v9.0.1)
*************************************

1. Configure R Server for Operationalization
2. Set a local admin password
3. Stop and start services
4. Change service ports
5. Encrypt credentials
6. Run diagnostic tests
7. Evaluate capacity
8. Exit

Web node endpoint: **http://localhost:12800/**

Please enter an option:
1

Set the admin password:
*************

Confirm this password:
*************

Configuration for Operationalization:

A. One-box (web + compute nodes)
B. Web node
C. Compute node
D. Reset machine to default install state
E. Return to main menu

Please enter an option:
A

Success! Web node running (PID: 4172)

Success! Compute node running (PID: 4172)

At this point the setup should be complete. Running diagnostics with the admin tool can check that it is.

Run Diagnostic Tests: A. Test Configuration

Please enter an option:
6

Preparing to run diagnostics...
***********************
DIAGNOSTIC RESULTS:
***********************
Overall Health: pass

Web Node Details:
Logs: /usr/lib64/microsoft-deployr/9.0.1/Microsoft.DeployR.Server.WebAPI/logs
Available compute nodes: 1

Compute Node Details:
Health of 'http://localhost:12805/': pass
Logs: /usr/lib64/microsoft-deployr/9.0.1/Microsoft.DeployR.Server.BackEnd/logs


Authentication Details:
A local admin account was found. No other form of authentication is configured.

Database Details:
Health: pass
Type: sqlite

Code Execution Test: PASS Code: ‘y <- cumprod(c(1500, 1+(rnorm(n=25,mean=.05, sd = 1.4)/100)))’

Yes, it even tests that the MRS interpreter runs! If the web node or the compute service has stopped, the following test will complain loudly. Note the useful paths to the log directories for failure details. Services can be stopped and started from selection 3 in the top-level menu.

Run Diagnostic Tests: B. Raw Server Status

**********************
SERVICE STATE (raw):
**********************

Please authenticate...

Username:
admin

Password:
*************
Server:
Health: pass
Details:
    logPath: /usr/lib64/microsoft-deployr/9.0.1/Microsoft.DeployR.Server.WebAPI/logs
backends:
    Health: pass
    http://localhost:12805/:
    Health: pass
    Details:
        maxPoolSize: 80
        activeShellCount: 1
        currentPoolSize: 5
        logPath: /usr/lib64/microsoft-deployr/9.0.1/Microsoft.DeployR.Server.BackEnd/logs
database:
    Health: pass
    Details:
    type: sqlite
    name: main
    state: Open


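3. Verify that the MRS server is running from the Linux prompt

The R Server web service can also be checked without the admin tool, by looking at the machine’s open ports. This command shows the ports the Linux machine is listening on: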

$ netstat -tupln

Active Internet connections (only servers)
Proto Recv-Q Send-Q Local Address           Foreign Address         State       PID/Program name
tcp        0      0 127.0.0.1:29130         0.0.0.0:*               LISTEN      42527/mdsd
tcp        0      0 127.0.0.1:29131         0.0.0.0:*               LISTEN      2001/mdsd
tcp        0      0 0.0.0.0:22              0.0.0.0:*               LISTEN      1265/sshd
tcp        0      0 0.0.0.0:9054            0.0.0.0:*               LISTEN      55348/Rserve
tcp        0      0 0.0.0.0:9055            0.0.0.0:*               LISTEN      55348/Rserve
tcp6       0      0 :::12805                :::*                    LISTEN      55327/dotnet
tcp6       0      0 :::22                   :::*                    LISTEN      1265/sshd
tcp6       0      0 :::12800                :::*                    LISTEN      55285/dotnet
udp        0      0 0.0.0.0:68              0.0.0.0:*                           1064/dhclient

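We can see that port 12800 is active for the web service. Port 12805 is the compute server, running on the same machine as the web service.

The next thing to do is to see whether you can connect to the service by running R Server locally on the VM and loading mrsdeploy.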
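The port check above can also be scripted. The following is only a sketch: `ports_listening` is a hypothetical helper name, and it assumes the column layout of the `netstat -tupln` listing shown above (local address in column 4, state in column 6).

```shell
# Hypothetical helper: succeed only if every given TCP port appears in a
# LISTEN row of netstat-style output read from stdin.
ports_listening() {
  # Keep the listing in a variable so it can be scanned once per port.
  listing=$(cat)
  for port in "$@"; do
    # Column 4 is "Local Address" (e.g. ":::12800"); column 6 is the state.
    printf '%s\n' "$listing" |
      awk -v p=":$port" '$6 == "LISTEN" && $4 ~ (p "$") { found = 1 } END { exit !found }' ||
      { echo "port $port is not listening" >&2; return 1; }
  done
}

# On the VM (assumes netstat is installed, as in the listing above):
#   sudo netstat -tupln | ports_listening 12800 12805 && echo "web and compute nodes are up"
```

Passing both 12800 and 12805 checks the web node and the compute node in one call; the function stops at the first port that is missing.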


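4. Check that the MRS server is running by logging in from the server itself

Do this by running a remote mrsdeploy session with localhost as the server. This is how one would “run MRS as an R Client,” even though the full set of MRS features is available. Running MRS as both client and server on the same machine is possible, but I see no purpose for it other than testing that the web service is accessible. The sequence of steps is: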

$ rserver
    [ MRS banner...]

    > endpoint <- "localhost:12800"   # The forum shows this format for logins.
    > library(mrsdeploy)
    > remoteLogin(endpoint)
    Username: admin
    Password: *************           # The password you set in the admin tool. 

    [...]

    REMOTE>

If authentication fails, you can look at the tail of the system log file for the error, like this:

$ cd /usr/lib64/microsoft-deployr/9.0.1/Microsoft.DeployR.Server.WebAPI/logs
$ sudo tail $(ls -t1 | head -1)   # Look at the end of the most recent logfile
... "Message":"The username doesn't belong to the admin user",...
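If you check the logs often, the lookup above can be wrapped in small helpers. This is a sketch only: `newest_log` and `log_messages` are hypothetical names, and the `"Message":"..."` pattern is assumed from the single log line shown above rather than from a documented log format.

```shell
# Hypothetical helper: print the path of the most recently modified entry
# in a log directory (the same newest-first `ls -t1 | head -1` idea as above).
newest_log() {
  dir=$1
  newest=$(ls -t1 "$dir" | head -1)
  [ -n "$newest" ] || { echo "no log files in $dir" >&2; return 1; }
  printf '%s/%s\n' "$dir" "$newest"
}

# Hypothetical helper: keep only the "Message" fields from log text on stdin.
log_messages() {
  grep -o '"Message":"[^"]*"'
}

# On the VM (reading the log files needs sudo, as in the command above):
#   logdir=/usr/lib64/microsoft-deployr/9.0.1/Microsoft.DeployR.Server.WebAPI/logs
#   sudo tail -n 50 "$(newest_log "$logdir")" | log_messages
```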

Then end the remote session with the command exit:

REMOTE> exit

5. Finish VM Configuration for remote access

Two more steps are needed before you can use the server over the network. You should set a public DNS name (i.e., a domain name) for the VM, since its public IP address is dynamic and may change when the machine is restarted. And as a matter of security, the Azure firewall (the “network security group” resource) needs to be configured.

Go back to portal.azure.com and find these settings associated with the VM:

  • Public DNS address
  • Open incoming service ports

Public IP

To set the public DNS name, go to the portal’s VM overview pane and click on the public-IP item, for instance, “mymrs-ip”:

[Figure: VM overview]

until you get to the configuration blade:

[Figure: IP configuration]

This will send you to the mymrs-ip component where you can change the DNS label.

Network Security Group

If you don’t open the service port in the network security group, a remote mrsdeploy login attempt will fail with the message

Error: Couldn't connect to server

since only port 22, for ssh, is allowed by default by the VM’s network security group. One option is to use ssh to set up port forwarding. I won’t explain that here. The alternative is to configure remote access on the server. For this you’ll need to open the port the admin tool reported as the web endpoint, typically 12800. The inbound security rules blade is buried in the VM choices -> Network Interfaces -> Network Security Group -> Inbound Security Rules. Choose “Add” to create a custom inbound rule for TCP port 12800. The result looks like this:

[Figure: Inbound security rules]

Now the server is ready for use!
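Before trying a remote login, you can confirm from the client that the inbound rule took effect. This is a sketch under stated assumptions: `port_open` is a hypothetical helper, it needs a client with bash (for its /dev/tcp pseudo-device) and the coreutils `timeout` command, and the host name in the usage comment is a placeholder for your own DNS name.

```shell
# Hypothetical helper: return 0 if a TCP connection to host:port succeeds
# within 5 seconds, and non-zero otherwise.
port_open() {
  # bash opens a TCP socket when a redirection targets /dev/tcp/HOST/PORT;
  # the connection is closed again when this child shell exits.
  timeout 5 bash -c 'exec 3<>"/dev/tcp/$1/$2"' _ "$1" "$2" 2>/dev/null
}

# From your local machine:
#   port_open mymrs.southcentralus.azure.com 12800 && echo reachable || echo blocked
```

A "blocked" result with the services running on the VM usually points back at the inbound security rule rather than at R Server itself.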



6. Check that the MRS server is running from another machine

You’ll need a local copy of MRS to do this. Copies are available from a few sources, including a “client side only” copy called, naturally, R Client, which is licensed for free. R Client gives you all the remoting capabilities of R Server and the same custom learning algorithms, but unlike R Server it is limited to datasets that fit in memory.

The sources of R Server are several:

  • MSDN subscription downloads include R Server for different platforms.
  • R Client is also a free download on MSDN.
  • Microsoft SQL Server comes with R Server as an option. You can install R Server “standalone” with the SQL Server installer in addition to installing it as part of SQL Server.
  • If you have installed R Tools for Visual Studio (RTVS), the R Tools menu has an item to install R Client.
  • Of course any VM that comes with R Server will work too. Notably, the Data Science VM, which hosts an exhaustive collection of data science tools, includes a copy of R Server.

To log in remotely from your local machine, the MRS commands are the same as before, except that on your local client you use the domain name of the server:

> endpoint <- "mymrs.southcentralus.azure.com:12800"
> library(mrsdeploy)
> remoteLogin(endpoint)

If, as shown, you do not include the admin account and password as arguments to remoteLogin, the command will bring up a modal dialog asking you for them. Be advised that this dialog may be hidden behind other windows, and you’ll have to look for it.



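The server returns a header listing the differences between your client and server MRS environments. Here is what a correct remote session returns on startup: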

Diff report between local and remote R sessions...

Warning! R version mismatch
local: R version 3.3.2 (2016-10-31)
remote: R version 3.2.3 (2015-12-10)

These R packages installed on the local machine are not on the remote R instance:

   Missing Packages
1        checkpoint
2  CompatibilityAPI
3              curl
...
23            RUnit

The versions of these installed R packages differ:

     Package   Local  Remote
1       base   3.3.2   3.2.3
...
23     utils   3.3.2   3.2.3


Your REMOTE R session is now active.
Commands:
        - pause() to switch to local session & leave remote session on hold.
        - resume() to return to remote session.
        - exit to leave (and terminate) remote session.

Once at the REMOTE> prompt you can explore the remote interpreter environment. These handy R functions let you explore the remote environment further:

Sys.getenv()    # will show the machine's OS environment variables on the server.
Sys.info()      # returns a named character vector with machine and user descriptions

Environment differences: Adding custom packages to the server

The comparative listing of packages shown when you log in to the remote session should alert you to the need to accommodate the differences between local and remote environments. Different R versions generate this warning:

Warning! R version mismatch

Differing R versions limit which packages are available in both environments.

Compatible but missing packages can be installed on the server. To install them, the remote session needs permission to write to one of the directories identified by .libPaths() on the remote server. This is not granted by default. If you feel comfortable letting the remote user make modifications to the server, you could grant this permission by making this directory writable by everyone:

$ sudo chmod a+w /usr/local/lib/R/site-library/

Then, to install a package such as glmnet into this directory, use:

REMOTE> install.packages("glmnet", lib="/usr/local/lib/R/site-library")

These installations persist from one remote session to another, and the “missing packages” warning at login is updated correctly. Strangely, though, IntelliSense for package names always refers to the local list of packages, so it will suggest packages that are unavailable at the remote.



Running batch R jobs on the server

Congratulations! Now you can run large R jobs on a VM in the cloud!

There are various ways to use the server to take advantage of the VM, in addition to running interactively at the REMOTE> prompt. A simple case is to use the remote server to run large, time-consuming jobs. For instance, this iteration, which computes a regression’s leave-one-out r-squared values:

rsqr <- c()
system.time(
  for (k in 1:nrow(mtcars)) {
    rsqr[k] <- summary(lm(mpg ~ . , data=mtcars[-k,]))$r.squared
  })
print(summary(rsqr))

can be done the same way remotely:

remoteExecute("rsqr <- c()\
    system.time(\
      for (k in 1:nrow(mtcars)) {\
        rsqr[k] <- summary(lm(mpg ~ . , data=mtcars[-k,]))$r.squared\
    })")

We’ll need to recall the results separately, since only the last value in the remote expression output is printed:

remoteExecute("summary(rsqr)")

For larger chunks of code, you can put them in script files and execute a file remotely using mrsdeploy::remoteScript("myscript.R"), which is simply a wrapper around mrsdeploy::remoteExecute("myscript.R", script=TRUE), where myscript.R is found in your local working directory.

Note that the mrsdeploy library is not needed in the script running remotely. Indeed, the VM with preinstalled Microsoft R Server 2016 (version 9.0.1) for Linux (Ubuntu version 16.04) runs R version 3.2.3, which does not include the mrsdeploy library. So both library(mrsdeploy) and install.packages("mrsdeploy") will generate an error in the remote session. If you’ve included these statements to enable your local script, be sure to remove them if you execute the script remotely, or the script will fail! If you want to use the same script in both places, a simple workaround is to skip the library call when the script runs in the remote session:

if ( Sys.info()["user"] != "rserve2" ) {
  library(mrsdeploy)
}

The ability of mrsdeploy to execute a script remotely is just the tip of the iceberg. It also enables moving files and variables back and forth between local and remote, and most importantly, configuring R functions as production web services. This set of deployment features merits another entire blog posting.



For more information

For details about different configuration options see Configuring R Server Operationalization. Libraries as required in the Operationalization instructions are already configured on the VM.

To see what you can do with a remote session, have a look here. And for a general overview, see this.

Go to the R Server documentation for the full API reference.
