Plot daily time spent on codereview.stackexchange.com

This draws an xyplot from CSV data obtained from the RescueTime API, showing the activity time for "http://codereview.stackexchange.com".

# -------------------------------------------------------
# PLOT TIME SPENT ON CODEREVIEW SE
# USING RESCUE TIME API AND R
# -------------------------------------------------------

# -------------------------------------------------------
# LIBRARIES
if(!require("RCurl")){
  install.packages("RCurl")
}
if(!require("lattice")){
  install.packages("lattice")
}
require("RCurl")
require("lattice")
# -------------------------------------------------------

print("Initializing...")

# -------------------------------------------------------
# RESCUE TIME API ACCESS

# api key 
# get from https://www.rescuetime.com/anapi/manage
api_key <- "..."

# query from date
from_date <- "2014-01-01"

# query to date
to_date <- format(Sys.Date(), "%Y-%m-%d")

# activity to get details of
activity <- "codereview.stackexchange.com"


# --api description--
# for more details see https://www.rescuetime.com/anapi/setup/documentation
# key - API key
# format - Output format
# rb - Restrict Start Date ("%Y-%m-%d")
# re - Restrict End Date ("%Y-%m-%d")
# rk - Restrict Kind (in this case it's activity)
# pv - Perspective (in this case it's interval)
# rs - Resolution Time (if it is day it's a daily report)
# rt - Restrict Thingy (Restrict Kind of Type)
url_format  <- "https://www.rescuetime.com/anapi/data?key=%s&format=csv&rb=%s&re=%s&rk=activity&pv=interval&rs=day&rt=%s"

# create url
api_url  <- sprintf(url_format,api_key,from_date,to_date,activity)
# -------------------------------------------------------

# -------------------------------------------------------
# PLOT

# get csv as a text
csv_text_data <- textConnection(getURL(api_url))

# parse csv
activity_data <- read.csv(csv_text_data,header = TRUE)


print(sprintf("Mean Time Spent on '%s':%f (seconds)",activity,mean(activity_data$Time))) 

activity_time_spent <- activity_data$Time
activity_dates <- activity_data$Date

# create an index for dates
activity_date_index  <- 1:length(activity_dates)

# plot
print(xyplot(activity_time_spent ~ activity_date_index, type="b", xlab = "Date Index", ylab = "Time (seconds)" ))
# -------------------------------------------------------

# -------------------------------------------------------
# CLEAN UP
detach("package:RCurl", unload = TRUE)
detach("package:lattice", unload = TRUE)
rm(list = ls())
# -------------------------------------------------------

Does this follow R conventions?

Run it by loading it into the R console with source("./codereview.R").

Download

CSV and R files

Output

Loading required package: RCurl 
Loading required package: bitops
Loading required package: lattice 
[1] "Initializing..." 
[1] "Mean Time Spent on 'codereview.stackexchange.com':2764.559322 (seconds)"

Related Meta: meta.codereview.stackexchange.com/questions/2521/…

Answer #1

Here is how I would rewrite your code. My comments follow below:

plotActivity <- function(
   activity  = 'codereview.stackexchange.com',
   api_key   = getOption('rescuetime_api_key'),
   from_date = NULL,
   to_date   = NULL) {

## This function plots time spent on a given URL
## using the rescue time API
##
## Arguments:
##    - activity:  the url you want to check for usage
##    - api_key:   user API key, you can get yours at
##                 https://www.rescuetime.com/anapi/manage
##    - from_date: a Date or string in "YYYY-MM-DD" format,
##                 defaults to January 1st of the same year as to_date
##    - to_date:   a Date or string in "YYYY-MM-DD" format,
##                 defaults to today

   library("RCurl")
   library("ggplot2")

   # arguments parsing and default settings
   if (is.null(api_key)) stop("please set your rescue time API key...")

   if (is.null(to_date)) to_date <- Sys.Date()
   to_date <- as.Date(to_date)
   if (is.na(to_date)) stop("error parsing to_date...")

   if (is.null(from_date)) from_date <- format(to_date, "%Y-01-01")
   from_date <- as.Date(from_date)
   if (is.na(from_date)) stop("error parsing from_date...")

   # build the API request url
   # for more details see https://www.rescuetime.com/anapi/setup/documentation

   base_url  <- "https://www.rescuetime.com/anapi/data"
   arguments <- c(key = api_key,   # API key
                  format = 'csv',  # Output format
                  rb = format(from_date),  # Restrict Start Date (as "YYYY-MM-DD" text)
                  re = format(to_date),    # Restrict End Date (as "YYYY-MM-DD" text)
                  rk = 'activity', # Restrict Kind
                  pv = 'interval', # Perspective
                  rs = 'day',      # Resolution Time
                  rt = activity)   # Restrict Thingy (Restrict Kind of Type)

   arg_str <- paste(names(arguments), arguments, sep = '=', collapse = '&')
   api_url <- sprintf("%s?%s", base_url, arg_str)

   # read data
   csv_text_data <- textConnection(getURL(api_url))
   activity_data <- read.csv(csv_text_data, header = TRUE)

   # print diagnostic and plot
   cat(sprintf("Total Time Spent between %s and %s on '%s': %f (seconds)\n",
               format(from_date), format(to_date), activity,
               sum(activity_data$Time)))

   activity_data$Date <- as.Date(activity_data$Date)
   ggplot(activity_data, aes(x=Date, y=Time.Spent..seconds.)) +
      geom_bar(stat="identity") +
      xlab("Date") +
      ylab("Time (seconds)")
}
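
The URL-building step can be tried on its own, without an API key or network access. The following is only a sketch: build_query() is a hypothetical helper name, the key is a dummy value, and percent-encoding the values with utils::URLencode() is an extra precaution that the function above does not take. It also shows why the dates are worth converting to text explicitly: as far as I can tell, combining a Date with character values through c() keeps only the underlying day count.

```r
# Sketch: build a query string from a named character vector.
# build_query() is a hypothetical helper, not part of RCurl.
build_query <- function(base_url, params) {
  # percent-encode each value; names are assumed to be plain ASCII keys
  values <- vapply(params, utils::URLencode, character(1), reserved = TRUE)
  paste0(base_url, "?",
         paste(names(params), values, sep = "=", collapse = "&"))
}

from_date <- as.Date("2014-01-01")
to_date   <- as.Date("2014-10-01")

# Mixing a Date into a character vector drops the Date class,
# so rb ends up as the number of days since 1970-01-01
naive <- c(key = "dummy-key", rb = from_date)
print(naive)

# Converting with format() first keeps the "YYYY-MM-DD" text
params <- c(key    = "dummy-key",   # dummy value, not a real API key
            format = "csv",
            rb     = format(from_date),
            re     = format(to_date))

url <- build_query("https://www.rescuetime.com/anapi/data", params)
print(url)
```

Keeping the parameters in a named vector means adding or removing one is a single-line change, and the format string never gets out of step with its arguments.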

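So here are a few thoughts:

I moved from a script to a function. That makes it much easier to use and share. I have abstracted what I think are all the right inputs, with sensible defaults. One interesting bit is the use of getOption for api_key, so users can set it once and for all by running options(rescuetime_api_key = "xyz123abc"). So now all you have to do is source the file where you store this function, then call the function. A function runs in its own environment, so every variable created during the run is removed when the function exits. Your global environment is never polluted, which removes the need for your "clean up" section (which, as janos pointed out, is quite harmful).
Like rm(list=ls()), forcing package installation via install.packages() is a bit harmful. Instead, just use library: it fails immediately if the package is missing and leaves it to the user to decide whether to install it. There is a nice blog post on why some people, me included, prefer library over require: http://yihui.name/en/2014/07/library-vs-require
I used R's Date objects wherever I could.
I rewrote how the API URL is built, using a named vector for the arguments, with inline comments. Hopefully you will agree that the code is more readable and easier to maintain, which is what you should always aim for.
I replaced print with cat. That way you do not get the [1] prefix that comes with printing a vector.
This has little to do with code review, but I think the total time on the site is more useful to look at. The mean time per day can be a bit confusing: the user might wonder whether it covers all days or only the days on which the site was visited (you chose the latter). The total time removes that ambiguity. Also, I think it is more useful to plot the data as a bar chart with a real time scale, i.e. one that leaves gaps for the periods when the site was not visited. I used ggplot2 for that. The x-axis may look odd, but you did spend six seconds on the site on July 1st, which is why it starts on that day, although you will not really notice a data point there.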
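
To make the mean-versus-total point concrete, here is a small sketch on made-up numbers (the dates and times below are invented, and I am assuming the API only returns rows for days with a visit):

```r
# Made-up sample: seconds spent per day, only for days with a visit
visits <- data.frame(
  Date = as.Date(c("2014-07-01", "2014-07-03", "2014-07-04")),
  Time = c(6, 1200, 600)
)

# Mean over visited days only, which is what mean() on such data reports
mean_visited <- mean(visits$Time)

# Mean over every calendar day in the range, counting missing days as zero
all_days <- seq(min(visits$Date), max(visits$Date), by = "day")
mean_all <- sum(visits$Time) / length(all_days)

# The total does not depend on that choice
total <- sum(visits$Time)

cat(sprintf("visited days: %.1f, all days: %.1f, total: %.0f\n",
            mean_visited, mean_all, total))
```

The two means differ as soon as a single day is skipped, while the total stays the same.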

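My rewrite got rid of activity_date_index <- 1:length(activity_dates), but if I needed it, I would use activity_date_index <- seq_along(activity_dates). Functions like seq_len or seq_along are better practice than 1:n because they handle the special case n == 0 correctly: they return an empty vector (integer(0)), whereas 1:n returns the vector c(1, 0) and wreaks havoc.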
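
A quick way to see the difference, assuming an empty result (for instance, a date range with no recorded activity):

```r
activity_dates <- character(0)   # stands in for an empty Date column

bad_index  <- 1:length(activity_dates)   # 1:0 counts down: 1, 0
good_index <- seq_along(activity_dates)  # integer(0)

print(bad_index)
print(good_index)

# A loop over bad_index runs twice even though there is nothing to process
iterations <- 0
for (i in bad_index) iterations <- iterations + 1
print(iterations)
```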

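Disclaimer: I do not have an account on the RescueTime website, so I could not fully test my code. If you find a bug, feel free to edit my post, I won't mind.

Thank you for sharing your code. I love R, and I especially like seeing people use it to connect to and analyze online data sources, as you did here. I hope my comments are helpful.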

Nice answer, please join The 2nd Monitor chat room. We need more R people.
– bhathiya-perera
Oct 1 '14 at 6:02

Answer #2

Nice initiative! And R is perfect for this kind of thing.

Conditional logic

if(!require("lattice")){
  install.packages("lattice")
}
require("lattice")

If the first require succeeds, there is no need to call it again. You can move the second call inside the if:

if(!require("lattice")){
  install.packages("lattice")
  require("lattice")
}
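
The if works because require() reports failure through its return value instead of stopping. A minimal sketch of that behaviour, next to library() for comparison (the package name is made up and assumed not to be installed):

```r
pkg <- "surelyNotAnInstalledPackage"   # hypothetical name, assumed missing

# require() returns FALSE (with a warning) when the package cannot be loaded
loaded <- suppressWarnings(
  require(pkg, character.only = TRUE, quietly = TRUE)
)
print(loaded)

# library() raises an error instead, which tryCatch() can intercept
result <- tryCatch({
  library(pkg, character.only = TRUE)
  "loaded"
}, error = function(e) "error")
print(result)
```

Which one fits depends on whether the script is supposed to recover from a missing package or stop right away.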

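Coding style

For coding style, you may find Google's R Style Guide interesting.
If that is too much to read, note that many of the usual recommendations from other languages make sense for R too.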

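For example, put a space after commas in argument lists:

# not so good
api_url  <- sprintf(url_format,api_key,from_date,to_date,activity)

# better
api_url <- sprintf(url_format, api_key, from_date, to_date, activity)

Put a space before the opening parenthesis and after the closing parenthesis in if statements:

# not so good
if(!require("RCurl")){

# better
if (!require("RCurl")) {

And so on.

Clean up

I'm sure your intentions with the cleanup at the end were the best.
But detaching the packages can be harmful.
What if the RCurl and lattice libraries were already loaded before running the script, and you want them to stay loaded afterwards?

This can be downright harmful:

rm(list = ls())

This deletes all objects in the global namespace.
If you had anything important there before running this script,
it will be lost.

The only use I can see for the cleanup code is if, after running the script, you want to continue working in a clean session.
But quitting and restarting would be much easier.

Future code reviews

The CSV file included in the download link contains 59 lines.
That's not too bad, you could have included it verbatim in the question.
I did not test with the real API;
there is not much to test there, since the retrieval method is still just a URL.
I simply changed the code to read the downloaded CSV instead.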

print(xyplot(...)) is necessary in a script, since xyplot returns a lattice plot object; see R FAQ 7.22
– rcs
Oct 16 '14 at 7:26

Really? OK, I removed it, thanks for pointing that out!
– janos
Oct 16 '14 at 7:27
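
For readers who want to see the behaviour described in that comment, here is a minimal sketch. It assumes the lattice package is installed (it normally ships with R), uses made-up data, and sends the output to a throwaway pdf(NULL) device so that it also runs non-interactively:

```r
library(lattice)

# xyplot() only builds a "trellis" object; nothing is drawn at this point
p <- xyplot(c(6, 1200, 600) ~ 1:3, type = "b",
            xlab = "Date Index", ylab = "Time (seconds)")
print(class(p))

pdf(NULL)             # throwaway graphics device, no file is written
print(p)              # this call is what actually renders the plot
invisible(dev.off())
```

At the console the plot shows up anyway because top-level results are auto-printed; inside source() or a function that auto-printing does not happen, hence the explicit print().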
