Converting a string to a table in R


I am trying to convert the Central Bank's past official reports into tabular format. I have the following scraper:

library(rvest)
library(dplyr)

url <- "http://nationalbank.kz/?docid=105&cmomdate=2009-05-15&switch=english"

p <- url %>%
  read_html() %>%
  html_nodes(xpath='//table[1]') %>%
  html_table(fill = TRUE)

gh = p[[11]]
str(gh)
txt = gh[, 1]

Which produces:

[1] "GOVERNMENT SECURITIES PLACEMENT RESULT 15.05.2009\r\n
              GOVERNMENT SECURITIES PLACEMENT RESULT\r\n\r\nThe
National Bank of the Republic of Kazakhstan announces the placement result
on the following parameters:\r\n\r\nType of security\tNotes
NBK\r\nNIN\tKZW1KD281882\r\nMaturity\t28 days\r\nType of
placement\tAuction\r\nDate of placement\t15.05.2009\r\nSettlement
date\t15.05.2009\r\nRedemption date\t12.06.2009\r\nActual amount of
placement\t24 999 999 991.30 tenge\r\n\t251 003 524
(quantity)\r\nDemand\t127 493 096 130.40 tenge\r\n\t1 280 053 174
(quantity)\r\nWeighted-averaged price\t99.60 tenge\r\nCut price\t99.59
tenge\r\nYield (coupon)\t5.24 %"

I am looking for help converting this string into the following table format:

Type of security    NIN Maturity    Type of placement   Date of placement   Settlement date Redemption date Actual amount of placement      Demand      Weighted-averaged price Cut price   Yield (coupon)
Notes NBK   KZW1KD281882    28 days Auction 15.05.2009  15.05.2009  12.06.2009  24 999 999 991.30 tenge 1 280 053 174 (quantity)    127 493 096 130.40 tenge    1 280 053 174 (quantity)    99.60 tenge 99.59 tenge 5.24%

I tried a few approaches using gsub(), but could not get close to the desired output.


1 answer:

Would the following be sufficient?

ans <- lapply(strsplit("GOVERNMENT SECURITIES PLACEMENT RESULT 15.05.2009\r\n
              GOVERNMENT SECURITIES PLACEMENT RESULT\r\n\r\nThe
National Bank of the Republic of Kazakhstan announces the placement result
on the following parameters:\r\n\r\nType of security\tNotes
NBK\r\nNIN\tKZW1KD281882\r\nMaturity\t28 days\r\nType of
placement\tAuction\r\nDate of placement\t15.05.2009\r\nSettlement
date\t15.05.2009\r\nRedemption date\t12.06.2009\r\nActual amount of
placement\t24 999 999 991.30 tenge\r\n\t251 003 524
(quantity)\r\nDemand\t127 493 096 130.40 tenge\r\n\t1 280 053 174
(quantity)\r\nWeighted-averaged price\t99.60 tenge\r\nCut price\t99.59
tenge\r\nYield (coupon)\t5.24 %", "\r\n", fixed=TRUE),

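    # split each "\r\n"-delimited line into fields on the tab character
    function(x) strsplit(x, split="\t", fixed=TRUE))

# keep only the lines that split into exactly two fields (label, value)
# and stack them into a two-column character matrix; rbind() ignores NULL
do.call(rbind, lapply(ans[[1]], function(x) {
    if(length(x)==2) {
        return(x)
    }
    return(NULL)
}))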

#       [,1]                          [,2]                       
#  [1,] "Type of security"            "Notes\nNBK"               
#  [2,] "NIN"                         "KZW1KD281882"             
#  [3,] "Maturity"                    "28 days"                  
#  [4,] "Type of\nplacement"          "Auction"                  
#  [5,] "Date of placement"           "15.05.2009"               
#  [6,] "Settlement\ndate"            "15.05.2009"               
#  [7,] "Redemption date"             "12.06.2009"               
#  [8,] "Actual amount of\nplacement" "24 999 999 991.30 tenge"  
#  [9,] ""                            "251 003 524\n(quantity)"  
# [10,] "Demand"                      "127 493 096 130.40 tenge" 
# [11,] ""                            "1 280 053 174\n(quantity)"
# [12,] "Weighted-averaged price"     "99.60 tenge"              
# [13,] "Cut price"                   "99.59\ntenge"             
# [14,] "Yield (coupon)"              "5.24 %"
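
If the wide, one-row layout from the question is what you need, the two-column matrix above can be reshaped in a further step. The following is only a sketch under a few assumptions: the matrix is retyped by hand as `kv` so the snippet runs on its own, the embedded "\n" characters are treated as wrapping noise and collapsed to spaces, and the two rows with an empty label are assumed to be the "(quantity)" continuation of the row above. The names `kv`, `squish` and `wide`, and the "(quantity)" suffix added to the column names, are my own choices and not part of the report.

```r
# Two-column label/value matrix as printed above (line breaks kept on purpose)
kv <- rbind(
  c("Type of security", "Notes\nNBK"),
  c("NIN", "KZW1KD281882"),
  c("Maturity", "28 days"),
  c("Type of\nplacement", "Auction"),
  c("Date of placement", "15.05.2009"),
  c("Settlement\ndate", "15.05.2009"),
  c("Redemption date", "12.06.2009"),
  c("Actual amount of\nplacement", "24 999 999 991.30 tenge"),
  c("", "251 003 524\n(quantity)"),
  c("Demand", "127 493 096 130.40 tenge"),
  c("", "1 280 053 174\n(quantity)"),
  c("Weighted-averaged price", "99.60 tenge"),
  c("Cut price", "99.59\ntenge"),
  c("Yield (coupon)", "5.24 %")
)

# Collapse every run of whitespace (including stray line breaks) to one space
squish <- function(x) trimws(gsub("\\s+", " ", x))
labels <- squish(kv[, 1])
values <- squish(kv[, 2])

# Assumption: an empty label continues the previous row, so reuse that
# row's label and tag it with "(quantity)" to keep column names unique
for (i in seq_along(labels)) {
  if (labels[i] == "" && i > 1) {
    labels[i] <- paste(labels[i - 1], "(quantity)")
  }
}

# One row, one column per label; setNames() keeps the spaces in the names
wide <- setNames(data.frame(t(values), stringsAsFactors = FALSE), labels)
```

Every cell is still a character string at this point. Parsing the amounts into numbers or the dates into Date objects would be a separate step, and the blank-label rule should be checked against other reports before relying on it.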